Agent SkillsAgent Skills
northseadl

image-studio

@northseadl/image-studio
northseadl
5
1 forks
Updated 4/1/2026
View on GitHub

AI image generation and editing: e-commerce templates (hero/banner/detail/lifestyle), image refinement (background replace/remove, enhance, retouch, style transfer), icon extraction (bg-removal + detect + smart crop with transparent output).

Installation

$npx agent-skills-cli install @northseadl/image-studio
Claude Code
Cursor
Copilot
Codex
Antigravity

Details

Pathimage-studio/SKILL.md
Branchmain
Scoped Name@northseadl/image-studio

Usage

After installing, this skill will be available to your AI coding assistant.

Verify installation:

npx agent-skills-cli list

Skill Instructions


name: image-studio metadata: version: 0.1.3 description: 'AI image generation and editing: e-commerce templates (hero/banner/detail/lifestyle),

image refinement (background replace/remove, enhance, retouch, style transfer),

icon extraction (bg-removal + detect + smart crop with transparent output).

'

Nano Banana — AI Image Generation & Editing

Unified CLI: ./nanobanana <module> [options] Full API reference: references/api_reference.md

First-Use Detection

Before executing any command, verify credentials:

Path 1: Environment Variable (Agent preferred, zero setup)

If any of GEMINI_API_KEY, GOOGLE_API_KEY, or NANOBANANA_API_KEY is set, no init needed — the client resolves automatically.

Path 2: Non-Interactive Init (Agent usable)

Ask user for their Google API key, then run:

./nanobanana auth init --api-key "<their-gemini-api-key>"

This verifies connectivity and persists credentials to ~/.nanobanana/credentials.json (chmod 600).

How to get an API Key

  1. Go to Google AI Studio
  2. Create API key
  3. Copy (the key typically starts with AIza)

After init succeeds, credentials auto-persist. No further user action needed.

Available Models

Gemini Model IDAliasMax ResBest For
gemini-2.5-flash-imageNano Banana1024pxSpeed + efficiency
gemini-3.1-flash-image-previewNano Banana 22KDefault. Speed + quality
gemini-3-pro-image-previewNano Banana Pro4KStudio quality, 4K, Thinking

Shortcut aliases: nano-banana-2 (default), nano-banana-pro, nano-banana, flash, pro, fast.

Users can override the model with --model <name>. Default is nano-banana-2 (Gemini 3.1 Flash Image).

Workflow 1: E-commerce Image Generation

./nanobanana generate "<product description>" [options]

Templates

Templates auto-enhance prompts with professional photography directives:

TemplateRatioSizeUse Case
hero1:12KProduct hero shot on white background
banner16:92KPromotional marketing banner
detail3:42KClose-up material/texture shot
lifestyle4:32KProduct in aspirational scene
grid1:12KFour-panel detail page showcase
social4:51KSocial media / Instagram post

Icon Templates (extraction-optimized)

Icon templates use the surrounding prompt pattern — the user's description is wrapped between a layout-constraining prefix and a style-constraining suffix. This forces the model to produce structured, grid-aligned output that the extraction pipeline can reliably split.

TemplateGridRatioSizeUse Case
icon1×11:11KSingle icon, centered, 80% fill
icon-set3×31:12K9 related icons in uniform grid
icon-set-63×23:22K6 related icons in landscape grid
icon-set-42×21:12K4 related icons in grid
icon-variants2×21:12KSame icon in 4 visual styles

Examples

# Product hero shot with white studio background
./nanobanana generate "luxury leather handbag, brown Italian calfskin" --template hero

# Marketing banner for a campaign
./nanobanana generate "summer skincare collection" --template banner --model nano-banana-pro

# Lifestyle scene
./nanobanana generate "minimalist desk setup with mechanical keyboard" --template lifestyle

# Custom prompt with explicit parameters
./nanobanana generate "organic tea packaging in zen garden setting" --ratio 16:9 --size 2K

# With reference images (automatically switches to image-to-image)
./nanobanana generate "same style but in blue colorway" --ref-image product_red.jpg

# Save with specific name and directory
./nanobanana generate "wireless earbuds" --template hero -o ./output -n earbuds_hero

# --- Icon Templates ---
# Single icon: centered, white bg, 80% canvas fill
./nanobanana generate "settings gear" --template icon

# 9-icon set in 3×3 grid (then split with icon-extract)
./nanobanana generate "social media: like, comment, share, save, follow, message, search, notification, profile" --template icon-set

# 4-icon set in 2×2 grid
./nanobanana generate "e-commerce: cart, payment, delivery, return" --template icon-set-4

# Same icon in 4 styles (flat, line-art, filled, 3D)
./nanobanana generate "cloud computing" --template icon-variants

Key Options

OptionDescription
--templateE-commerce template (auto-sets ratio/size/prompt)
--modelModel override (default: nano-banana-2)
--ratioAspect ratio override
--sizeResolution override (1K/2K/4K)
--formatOutput format (png/jpeg/webp)
--ref-imageReference image file (repeatable, up to 14 for Gemini 3+)
--rawUse prompt as-is, skip template enhancement
-o/--outputOutput directory
-n/--nameOutput filename (without extension)

Workflow 2: Image Refinement

./nanobanana refine <input-image> [prompt] [options]

Templates

TemplateNeeds PromptUse Case
bg-replaceReplace background with specified scene
bg-whiteWhite e-commerce background
bg-removeChromakey green for compositing
enhanceQuality boost (sharp, color, noise)
retouchNatural portrait retouching
object-swapReplace specific object in scene
style-transferApply artistic style
color-gradeProfessional color grading
extendExtend/outpaint canvas

Examples

# White background for e-commerce
./nanobanana refine product.jpg --template bg-white

# Replace background
./nanobanana refine product.jpg "modern kitchen counter with marble surface" --template bg-replace

# General quality enhancement
./nanobanana refine photo.jpg --template enhance --size 4K --model nano-banana-pro

# Portrait retouching
./nanobanana refine portrait.jpg --template retouch

# Object swap
./nanobanana refine room.jpg "the wooden table with a glass coffee table" --template object-swap

# Style transfer
./nanobanana refine photo.jpg "Japanese woodblock print" --template style-transfer

# Free-form editing without template
./nanobanana refine photo.jpg "make the lighting warmer and add golden hour glow"

Workflow 3: Icon Extraction

./nanobanana icon-extract <input-image> [options]

Full pipeline: background removal → detection (grid or auto) → smart crop → individual RGBA PNGs + manifest.

Detection Strategies (--layout)

StrategySyntaxBest For
auto--layout autoUnstructured images, unknown layout (default)
grid--layout grid-CxRTemplate-generated sprite sheets (reliable!)

Why grid mode is critical: When using icon-set / icon-set-4 / icon-set-6 templates, the model is prompted to produce a grid layout. Grid splitting (--layout grid-3x3) is deterministic and far more reliable than component analysis — it divides the image into N×M equal cells, then refines each cell's bounding box by scanning the alpha channel for actual content. Empty cells are discarded.

Key Options

OptionDefaultDescription
-o / --output.Output directory
--prefixiconFilename prefix (→ icon_001.png, icon_002.png, ...)
-l / --layoutautoDetection: auto or grid-CxR (e.g., grid-3x3)
--modelbirefnet-general-literembg model for bg removal
--skip-bg-removaloffSkip bg removal (input already transparent)
--padding0.10Padding ratio around each icon (0.10 = 10%)
--min-size24Minimum bounding box side (px) to qualify as icon
--merge-distance0.05Merge threshold for auto mode (ratio of image diagonal)
--output-size512Export size (square px). Use 0 to keep original size
--no-squareoffKeep original aspect ratio instead of squaring
--keep-intermediateoffSave the background-removed intermediate image

Examples

# Grid split — for template-generated sprite sheets (most reliable)
./nanobanana icon-extract sprite_3x3.png --layout grid-3x3 -o ./icons/
./nanobanana icon-extract quad.png --layout grid-2x2 --output-size 256

# Auto detect — for unstructured images
./nanobanana icon-extract icon_sheet.png -o ./icons/

# Skip bg removal (already transparent)
./nanobanana icon-extract transparent.png --skip-bg-removal --layout grid-3x2

# Debug: keep intermediate bg-removed image
./nanobanana icon-extract app_icons.png --model birefnet-general --keep-intermediate

End-to-End Chaining: Generate → Extract

The most reliable icon workflow chains generation with extraction:

# Step 1: Generate 9-icon sprite sheet
./nanobanana generate "navigation: home, search, favorites, cart, profile, settings, notifications, messages, help" --template icon-set

# Step 2: Extract with matching grid
./nanobanana icon-extract generated_image.png --layout grid-3x3 -o ./icons/

The generate output prints a layout hint (e.g., Layout: 3x3 grid) — the Agent should automatically chain to icon-extract with the matching --layout grid-CxR.

Agent Decision Guide

The icon pipeline is a default tool — the Agent decides autonomously:

ScenarioAgent Action
"生成9个App图标"generate --template icon-seticon-extract --layout grid-3x3
"生成4个图标"generate --template icon-set-4icon-extract --layout grid-2x2
"生成单个图标"generate --template icon (no extraction needed)
User provides icon sheet with known gridicon-extract --layout grid-CxR directly
User provides unstructured imageicon-extract (auto mode)
"图标风格对比"generate --template icon-variantsicon-extract --layout grid-2x2

Intent → Command Mapping

E-commerce Generation

User IntentCommand
"生成商品主图"generate "<描述>" --template hero
"做一个促销 Banner"generate "<描述>" --template banner
"产品细节图"generate "<描述>" --template detail
"生活场景图"generate "<描述>" --template lifestyle
"详情页四宫格"generate "<描述>" --template grid
"社交媒体图片"generate "<描述>" --template social
"高清4K产品图"generate "<描述>" --template hero --size 4K --model nano-banana-pro
"AI图片" (generic)generate "<描述>" (no template)

Icon Generation + Extraction

User IntentCommand
"生成单个图标"generate "<描述>" --template icon
"生成9个图标集"generate --template icon-seticon-extract --layout grid-3x3
"生成6个图标"generate --template icon-set-6icon-extract --layout grid-3x2
"生成4个图标"generate --template icon-set-4icon-extract --layout grid-2x2
"图标风格对比"generate --template icon-variantsicon-extract --layout grid-2x2
"提取图标"icon-extract <img> -o ./icons/
"切割图标/icon sheet"icon-extract <img> --layout grid-NxM
"去背景+裁切"icon-extract <img>
"小图标256px"icon-extract <img> --output-size 256
"保持原比例"icon-extract <img> --no-square

Image Refinement

User IntentCommand
"换白色背景"refine <img> --template bg-white
"换背景/抠图"refine <img> "<new bg>" --template bg-replace
"去背景"refine <img> --template bg-remove
"图片增强/精修"refine <img> --template enhance
"人像修图"refine <img> --template retouch
"替换物品"refine <img> "<swap>" --template object-swap
"风格迁移"refine <img> "<style>" --template style-transfer
"调色"refine <img> "<mood>" --template color-grade
"扩展画面"refine <img> "<direction>" --template extend

Auth Management

CommandAction
auth init --api-key <KEY>Store and verify API key
auth init --api-key <KEY> --no-verifyStore without verification
auth statusShow current auth configuration
auth cleanRemove stored credentials

Error Recovery

SituationAgent Action
No API keyTrigger First-Use Detection flow
401 Invalid keyauth status, guide to https://aistudio.google.com/apikey
403 Permission deniedCheck API key restrictions / project settings
429 Rate limitedWait and retry
File not foundVerify path, suggest alternatives
File too large (>20MB)Suggest resizing before upload
IMAGE_SAFETY blockRephrase prompt, avoid safety-filtered content
No image in responseCheck finishReason, may need prompt adjustment

Structure

image-studio/
├── nanobanana             ← Unified CLI entry point (bash)
├── SKILL.md               ← This file (Agent instruction)
├── scripts/
│   ├── nanobanana_api.py  ← Core engine (Google Gemini API client)
│   ├── auth.py            ← Credential management (init / status / clean)
│   ├── generate.py        ← E-commerce image generation (6 templates)
│   ├── refine.py          ← Image refinement and editing (9 templates)
│   └── icon_extract.py   ← Icon extraction pipeline (bg-removal + detect + crop)
├── evals/
│   └── evals.json         ← Test cases for skill evaluation
└── references/
    └── api_reference.md   ← Full API parameter reference