Paradigm Shift of Street Visual Intelligence in Urban Planning
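YIN Hanyu
Institute of Remote Sensing and Geographic Information System, School of Earth and Space Sciences, Peking University; Ph.D. candidate
SUN Yumei
Shijiazhuang Institute of Railway Technology; associate professor, master's supervisor
WU Lun
Institute of Remote Sensing and Geographic Information System, School of Earth and Space Sciences, Peking University; professor, doctoral supervisor
ZHANG Fan (corresponding author)
Institute of Remote Sensing and Geographic Information System, School of Earth and Space Sciences, Peking University; research professor, doctoral supervisor, fanzhanggis@pku.edu.cn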
Abstract: Streets are the basic units of a city. Their visual characteristics directly reflect the quality of urban space and strongly influence residents' well-being. Street view imagery captures street scenes from a human perspective, offering a distinctive viewpoint for micro-level analysis of the urban environment, and has become an indispensable data source for urban planning. This study systematically reviews how street view data have been applied in urban planning, with a focus on their close connection to advances in artificial intelligence (AI). From early manual audits, to feature engineering based on statistical machine learning, to automated analysis driven by deep learning, the use of street view data has become markedly more efficient and more accurate. In recent years, self-supervised learning and large language models have further expanded the potential of street view data, enabling them to support more complex urban analysis tasks. By reviewing the application of street view data across these technological stages, this study shows how AI has reshaped the analytical paradigm and looks ahead to its potential for future urban planning and development.
Keywords: street view images; artificial intelligence; urban planning; analytical paradigm
CLC number: TU981
Document code: A