Text to Image
Shibuya Rooftop Selfie Photography — AI Image Prompt
A highly detailed Japanese prompt for gpt-image-2 designed to generate realistic or illustrative selfies of a character from a high-rise rooftop overlooking the Shibuya Scramble Crossing, emphasizing composition and lighting. - AIPinMaker

Prompt
参照画像の人物を、{argument name="撮影場所" default="渋谷スクランブル交差点"}を正面下方に見下ろせる{argument name="建物" default="高層商業ビルの屋上展望デッキ"}で、スマートフォンを使ってセルフィーを撮っている瞬間として描写してください。スマホ自身の厳密な撮影画面ではなく、その瞬間を近距離の第三者カメラが捉えたセルフィー風構図です。
参照画像の顔立ち、目、髪型、髪色、年齢感、衣装の色・形・素材・装飾、人物の雰囲気を維持し、別人化させないでください。元の背景、照明、構図、ポーズは継承しません。
人物を唯一の主役とし、顔から太もも付近までの安定した3/4身構図にしてください。全身を無理に入れず、顔、上半身、衣装、スマホ、背景の再現性を優先します。
人物は安全な屋上展望デッキに立ち、片腕を前方へ伸ばしてスマホを持ち、その画面へ自然に視線を向けます。スマホは前景にやや大きく見せますが、顔や衣装を隠さず巨大化させません。反対の腕は身体の近くに自然に配置してください。
画面下端または左右の端に、屋上床と透明なガラス柵を少量だけ見せ、撮影場所を明確にしてください。屋上部分は画面の10%以内。空中浮遊、落下、危険な縁立ちは避けてください。
カメラは人物の少し前方かつ上方から、渋谷を見下ろす斜め俯瞰。完全な真上視点にはしません。人物の背後中央から下方にスクランブル交差点を大きく配置し、幅広い複数方向の横断歩道、交差点中央、スクランブル横断中の多数の歩行者を明確に見せてください。人をまばらにせず、中央も空白にしません。
周囲には高密度な商業ビル、曲面ガラス、店舗ファサード、大型デジタルビジョンを近く大きく配置し、少なくとも3面のビジョンを見せてください。ビジョンには青、シアン、赤、マゼンタ、ピンクを含む鮮やかな抽象映像や人物シルエットを表示し、読める文字、企業名、実在ロゴは出さないでください。
時間帯は柔らかな自然光の午後。道路は自然なニュートラルグレー、横断歩道は明確な白。背景を衣装色に合わせて低彩度化、白色化、単色化せず、群衆、商業ビル、ビジョン、窓の反射による情報量と色彩で渋谷の華やかさを一定に保ってください。白飛び、過度なHDR、夕焼け、夜景、霧、青白い色かぶりは避けます。人物の顔と衣装は背景よりわずかに鮮明にしてください。
実写参照は高品質な実写のまま、アニメ・イラスト参照は元の絵柄、線、彩色、質感、顔の比率を維持してください。実写とアニメを相互変換しません。
余分な手足や指、不自然なスマホ、身体の接続崩れ、一般的な交差点への変更、空中浮遊、落下表現、読める広告文字、実在ロゴを避けてください。

Depict the person in the reference image on the {argument name="setting" default="rooftop observation deck of a high-rise commercial building"}, overlooking the {argument name="location" default="Shibuya Scramble Crossing"} directly below and in front, at the moment they are taking a selfie with a smartphone. This is a selfie-style composition captured by a close-range third-person camera, not the actual phone screen view.
Maintain the facial features, eyes, hairstyle, hair color, apparent age, the color, shape, material, and decoration of the outfit, and the person's overall presence from the reference image; do not turn them into a different person. Do not inherit the original background, lighting, composition, or pose.
Make the person the sole focus in a stable 3/4 body shot from the face to around the thighs. Prioritize the reproduction of the face, upper body, outfit, smartphone, and background over showing the full body.
The person stands on a safe rooftop deck, extending one arm forward holding the smartphone and looking naturally at the screen. The smartphone should appear somewhat large in the foreground without obscuring the face or outfit. Place the other arm naturally near the body.
Show a small portion of the rooftop floor and a transparent glass fence at the bottom or sides of the frame to clarify the location. The rooftop should occupy no more than 10% of the frame. Avoid floating in midair, falling, or standing on a dangerous edge.
The camera is slightly ahead and above the person, providing a diagonal high-angle view of Shibuya. Do not use a top-down view. Place the Scramble Crossing prominently below and behind the center of the person, showing wide crosswalks in multiple directions, the center of the intersection, and numerous pedestrians. Do not leave the center empty or sparse.
Surround the scene with high-density commercial buildings, curved glass, storefront facades, and large digital billboards placed close and large in the frame, showing at least three screens. The screens should display vibrant abstract visuals or human silhouettes in blue, cyan, red, magenta, and pink, without readable text, company names, or real brand logos.
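The time is a soft, naturally lit afternoon. The road is a natural neutral gray and the crosswalks are clearly white. Do not desaturate, whiten, or flatten the background to a single color to match the outfit; keep Shibuya's vibrancy consistent through the detail and color of the crowd, commercial buildings, screens, and window reflections. Avoid blown-out highlights, excessive HDR, sunsets, night scenes, fog, and bluish-white color casts. Render the person's face and outfit slightly more distinct than the background.
For photographic references, keep high-quality photorealism; for anime or illustration references, preserve the original art style, linework, coloring, texture, and facial proportions. Do not convert between photo and anime.
Avoid extra limbs or fingers, unnatural smartphones, broken body connections, substitution of a generic intersection, floating, falling, readable advertising text, and real logos.
Prompt breakdown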
- Subject: Reference person taking a selfie with a smartphone on a Shibuya rooftop overlooking the Scramble Crossing
- Style: Third-person selfie-style realistic photography
- Lighting: Soft natural afternoon light
- Composition: Diagonal high-angle 3/4 body shot from face to thighs with the Scramble Crossing and at least three digital billboards prominent behind the figure
- Mood: Lively urban atmosphere with dense commercial buildings, curved glass, and colorful abstract digital displays
Remix ideas
- Shift the camera a few degrees lower to reveal more of the transparent glass railing at the frame edge
- Swap the billboard visuals to include more magenta silhouettes while keeping all content abstract and text-free
- Tilt the reference person's head slightly for a more natural gaze toward the phone screen
How to use this AI Image prompt template
1. Copy the prompt: grab this template’s prompt and negative prompt.
2. Pick a model: choose a recommended AI model for the best match.
3. Generate: open the studio with one click and create your result.
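
When copying the prompt by hand, the `{argument name="..." default="..."}` placeholders still need to be resolved into plain text. The following is a minimal sketch of one way to do that, assuming each placeholder is meant to be replaced by a user-supplied value or, failing that, by its default. The helper names `fill_prompt` and `list_arguments` are hypothetical and are not part of any site or model API.

```python
import re
from typing import Dict, Optional

# Matches placeholders written as {argument name="..." default="..."}.
# Assumption: names and defaults never contain a double quote.
PLACEHOLDER = re.compile(r'\{argument name="([^"]*)" default="([^"]*)"\}')


def list_arguments(template: str) -> Dict[str, str]:
    """Return each placeholder name mapped to its default value."""
    return {name: default for name, default in PLACEHOLDER.findall(template)}


def fill_prompt(template: str, values: Optional[Dict[str, str]] = None) -> str:
    """Replace every placeholder with a supplied value, else its default."""
    supplied = values or {}

    def substitute(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        return supplied.get(name, default)

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Shortened excerpt of the English prompt, used only for illustration.
    excerpt = (
        'Depict the person on the {argument name="setting" '
        'default="rooftop observation deck of a high-rise commercial building"}, '
        'overlooking the {argument name="location" '
        'default="Shibuya Scramble Crossing"}.'
    )
    print(list_arguments(excerpt))
    print(fill_prompt(excerpt))
    # Override one argument and keep the default for the other.
    print(fill_prompt(excerpt, {"setting": "glass-walled sky lounge"}))
```

Listing the arguments first makes it easy to see which parts of the scene are intended to be swapped before the text is pasted into a model.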
Related templates

High-Energy Character Transformation with Extreme Perspective - Nano Banana Pro AI Prompt for Social Media Post
{ "reference_priority": { "character_face": "STRICT_REFER_TO_IMAGE", "outfit_and_hair_logic": "FORCE_EXACT_REPLICA_FROM_REFERENCE", "footwear_logic": "FORCE_EXACT_REPLICA_FROM_REFERENCE", "consistency_weight": "MAXIMUM" }, "subject": { "type": "woman_identity_perfectly_matched_to_reference_image", "framing": "extreme_high-angle_bird's-eye_view_full-body_shot", "identity_lock": "maintaining_identical_facial_features_from_reference", "features": { "eyes": "looking_up_at_camera_with_a_bright_friendly_gaze", "hair": "EXACT_REPLICATE_HAIRSTYLE_FROM_REFERENCE: length, color, texture, and style must be identical, no headwear", "expression": "cheerful_and_cute_expression_with_a_playful_smile" }, "pose_structural_lock": { "overall": "standing_confidently_at_the_highest_point_of_a_rugged_mountain_peak", "arms": "right_hand_raised_near_the_eye_making_a_peace_sign_V-sign_YA_gesture", "hands": "fingers_clearly_formed_into_a_peace_sign_near_the_face", "shoulders": "slightly_slouched_creating_a_top-down_foreshortening_effect", "perspective": "heavy_wide-angle_distortion_making_the_head_look_larger_than_feet" } }, "apparel_specification": { "logic": "CLOTHING_AND_FOOTWEAR_MUST_BE_AN_EXACT_CLONE_OF_REFERENCE_IMAGE", "outfit_main_piece": { "top": "Identical_inner_layer_as_seen_in_reference", "bottom": "Identical_pants_from_reference_image", "footwear": "EXACT_REPLICATE_FOOTWEAR_FROM_REFERENCE: replicate the specific shoes, colors, and design from the reference image exactly" } }, "environment": { "setting": "the_summit_of_a_high_mountain_with_jagged_rocks_and_a_cliff_edge", "lighting": "bright_natural_daylight_with_soft_clouds_below", "background": "monumental_cloud_formations_in_the_distant_sky_arranged_clearly_to_spell_out_the_word_'BeautyVerse (background word)'_hovering_above_mountain_ranges", "atmosphere": "majestic_and_adventurous_high-altitude_vibe_with_surreal_elements" }, "realism_and_rendering": { "style": "adventure_outdoor_photography_with_surrealist_elements", "camera": "Ultra-wide_angle_lens_shot_from_above_height_emphasizing_the_height_of_the_peak", "image_quality": "8k_resolution_hyper-realistic_rock_textures_and_fabric_weave", "aspect_ratio": "3:4" } }

Ultra-Wide Urban Sneaker Portrait
Use my uploaded portrait only for facial identity and hairstyle. Create a realistic wide-angle urban fashion photo inspired by the reference: a person crouching in the middle of a modern city street, shot from an extremely low camera angle with strong perspective distortion. One hand reaches toward the lens in the foreground, and one white sneaker sole is very close to the camera, oversized due to the wide-angle lens. Tall glass skyscrapers rise on both sides, cloudy overcast sky, cinematic street atmosphere, natural daylight, slightly raw smartphone-photo realism. Keep my face recognizable, realistic facial structure, natural skin texture, real hair details, no plastic skin, no anime style. Outfit should be casual and soft-toned, similar to the reference: loose light shirt (top style), relaxed pants (pants style), white sneakers. Dynamic pose, dramatic depth, realistic shadows, coherent lighting, high-detail city background, editorial street photography, 2:3 vertical composition, no text, no logo, no watermark. Negative Prompt: anime, cartoon, doll face, plastic skin, over-smoothed skin, fake face, face swap look, distorted identity, unrealistic eyes, bad hands, extra fingers, broken limbs, deformed shoe, messy perspective, low quality, blurry face, watermark, logo, text, signature

Tibetan Prayer Flag Highland Selfie
Create an ultra-wide-angle highland travel selfie in a Chinese Tibetan cultural aesthetic, shot from a dramatic low front perspective with the subject reaching one open hand toward the camera so the palm and fingers are very large in the foreground with strong depth distortion. The traveler is centered under a vast circular canopy of Tibetan prayer flags arranged in concentric rings and radial lines, with bright red, blue, yellow, green, and white fabric squares forming a mandala-like spiral overhead against a vivid blue sky. The sun flares through the flags at the upper left, creating sparkling highlights and high-contrast daylight. The subject wears a bright turquoise blue (jacket color) outdoor shell jacket, white pants, a black backpack strap, and a mustard yellow (hat color) knit beanie with sunglasses resting on top; long dark hair is visible. Place the scene on a grassy highland meadow with distant low hills and rows of prayer flags stretching across the background. Add falling white snowflakes or windblown ice particles throughout the frame, some large and blurred close to the lens, giving an adventurous alpine travel feeling. Use realistic photography, action-camera perspective, 14mm fisheye lens look, dynamic foreshortening, crisp saturated colors, natural skin tones, cinematic sunlight, sharp center details, shallow foreground blur, and an energetic social-media travel selfie composition. The subject’s face should appear as a soft anonymized blur or replaceable portrait area for the uploaded person (traveler identity).

Three-Person Merged Top-Down Selfie
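Do not swap faces; the faces must not be changed in any way! Merge the three people from image 1, image 2, and image 3 into a single high-angle three-person selfie. The composition is tight, with the three subjects close together, heads tilted slightly upward and eyes looking straight into the camera, creating a strong visual impact. The person from image 1 stands in the middle, slightly forward; facial likeness must be preserved. Bodies are turned slightly inward. The shot is taken from a high angle so that the heads appear exaggeratedly large, matching the typical Japanese and Korean selfie style. Keep the image simple and clean to further emphasize the subjects. The style leans toward Japanese visual kei, with high overall clarity, shot as an iPhone front-camera selfie, resulting in a refined, fashionable group photo. The people should blend seamlessly into the scene with natural visual transitions, and the lighting should be bright and even. The background is a huge movie theater filled with audience members; the subjects have their backs to the screen, which is filled with a beautiful movie poster for Zootopia 2 (movie poster name).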

Coquette Aesthetic High-Angle Selfie - Nano Banana Pro AI Prompt for Social Media Post
{ "image_prompt_data": { "subject": { "demographics": "Young woman, fair complexion, roughly 20s", "hair": "Long, dark brown hair with loose waves, center part, draping over shoulders", "eyes": "Blue-grey eyes, direct gaze into camera", "face": "Soft facial features, hand resting gently on left cheek, slight smile", "makeup": "Rosy pink blush on cheeks and nose, soft pink lip color, small pearl or rhinestone accents placed under the center of each eye" }, "apparel": { "top": "White semi-sheer floral lace top, scoop neckline, ribbon tie front detail, scalloped lace edges, cottagecore/coquette aesthetic", "jewelry": "Gold chain necklace with a puffed heart pendant, simple silver ring on left hand ring finger" }, "pose_and_framing": { "type": "High-angle selfie", "composition": "Medium shot, looking up at the camera, angled slightly downwards", "gesture": "Left hand resting against jawline/cheek" }, "environment": { "setting": "Indoor casual living space", "background_elements": [ "Light wood acoustic guitar hanging on a white wall", "Wicker side table or furniture structure", "Black television remote control resting on wicker surface", "Dark grey textured curtain on the right side" ] }, "lighting_and_style": { "lighting": "Soft, diffused natural daylight coming from the front/side", "aesthetic": "Soft girl, coquette, casual, candid, social media snapshot style" } } }

Stylized 3D Animated Family Portrait
Create a vertical 4:5 (aspect ratio) premium stylized 3D animated family portrait (art style) using the uploaded photo as reference. Keep every family member clearly recognizable through facial features, hairstyle, glasses, age, expressions, and overall vibe while transforming them into a polished cute high-end 3D animation style. Use soft cinematic shading, detailed hair strands, realistic fabric textures, and warm natural skin tones. Avoid plastic, toy-like, or cheap CGI looks. Use a strict elevated high-angle camera view from above, tilted downward toward the family. Faces and upper bodies should appear slightly larger while the full bodies taper naturally toward the feet. Keep everyone head-to-toe and large in the frame with minimal empty space. Arrange the family in a tight affectionate cluster with joyful candid energy. Add playful hand-drawn doodles around the family using dark brown line art with hearts, stars, and spark marks. (additional elements)
FAQ
- How does the prompt keep the person consistent with the reference image?
- It explicitly requires preserving the exact facial features, eyes, hairstyle, hair color, age appearance, and all outfit details including color, shape, material, and decoration without any changes.
- Why show the Scramble Crossing with numerous pedestrians and no empty center?
- The instructions place wide crosswalks, the intersection center, and a dense crowd of walkers directly behind the figure to capture the full energy of Shibuya without sparse areas or generic street views.