Publications | Yan Luo

2023

GUI: A Comprehensive Dataset of Global Urban Infrastructure Based on Geospatial Visual Foundation Models

Zhenyu Han, Xin Zhang, Xi Yanxin, and 3 more authors

Under review (at The Web Conference 2024), 2023

The rapid urbanization process has led to the concentration of more than half of the world’s population in cities, placing significant strain on urban infrastructure. However, the substantial social and financial costs of infrastructure data collection impede in-depth analyses towards a sustainable urban design, especially in developing countries. In this paper, we present a comprehensive dataset with interactive web visualization that encompasses urban infrastructure information from 1178 cities worldwide, covering an area of 93,088 km². Utilizing recent advances in web technology and geospatial visual foundation models, 20 distinct categories of urban infrastructure are identified, including transportation, art & sports, and industrial facilities. The proposed dataset indicates that global urban infrastructure is concentrated in a surprisingly high proportion in high-income countries, whose infrastructure per capita is 18.74 times higher than that in low-income countries. As the first global-scale comprehensive infrastructure dataset, it sheds light on the sustainable development of cities and exposes the stark inequity in urban infrastructure provision for vulnerable populations.

Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images

Ye Liu, Huifang Li, Chao Hu, and 3 more authors

IEEE Transactions on Neural Networks and Learning Systems, 2023

The task of instance segmentation in remote sensing images, aiming at performing per-pixel labeling of objects at instance level, is of great importance for various civil applications. Despite previous successes, most existing instance segmentation methods designed for natural images encounter sharp performance degradations when they are directly applied to top-view remote sensing images. Through careful analysis, we observe that the challenges mainly come from the lack of discriminative object features due to severe scale variations, low contrasts, and clustered distributions. In order to address these problems, a novel context aggregation network (CATNet) is proposed to improve the feature extraction process. The proposed model exploits three lightweight plug-and-play modules, namely dense feature pyramid network (DenseFPN), spatial context pyramid (SCP), and hierarchical region of interest extractor (HRoIE), to aggregate global visual context at feature, spatial, and instance domains, respectively. DenseFPN is a multi-scale feature propagation module that establishes more flexible information flows by adopting inter-level residual connections, cross-level dense connections, and a feature re-weighting strategy. Leveraging the attention mechanism, SCP further augments the features by aggregating global spatial context into local regions. For each instance, HRoIE adaptively generates RoI features for different downstream tasks. Extensive evaluations of the proposed scheme on the iSAID, DIOR, NWPU VHR-10, and HRSID datasets demonstrate that the proposed approach outperforms state-of-the-art methods under similar computational costs.

Timestamps as Prompts for the Geography-Aware Location Recommendation

Yan Luo, Haoyi Duan, Ye Liu, and 1 more author

In Proceedings of the 32nd ACM International Conference on Information & Knowledge Management (CIKM 2023), 2023

Location recommendation plays a vital role in improving users’ travel experience. The timestamp of the POI to be predicted is of great significance, since a user will go to different places at different times. However, most existing methods either do not use this kind of temporal information, or just implicitly fuse it with other contextual information. In this paper, we revisit the problem of location recommendation and point out that explicitly modeling temporal information is a great help when the model needs to predict not only the next location but also further locations. In addition, state-of-the-art methods do not make effective use of geographic information and suffer from the hard boundary problem when encoding geographic information by gridding. To this end, a Temporal Prompt-based and Geography-aware (TPG) framework is proposed. The temporal prompt is first designed to incorporate temporal information of any further check-in. A shifted window mechanism is then devised to augment geographic data for addressing the hard boundary problem. Via extensive comparisons with existing methods and ablation studies on five real-world datasets, we demonstrate the effectiveness and superiority of the proposed method under various settings. Most importantly, our proposed model has a superior ability for interval prediction. In particular, the model can predict the location that a user wants to go to at a certain time while the most recent check-in behavioral data is masked, or it can predict a specific future check-in (not just the next one) at a given timestamp.

End-to-End Personalized Next Location Recommendation via Contrastive User Preference Modeling

Yan Luo, Ye Liu, Fu-Lai Chung, and 3 more authors

Under Review (at AAAI 2023), 2023

Predicting the next location is a highly valuable and common need in many location-based services such as destination prediction and route planning. The goal of next location recommendation is to predict the next point-of-interest a user might go to based on the user’s historical trajectory. Most existing models learn mobility patterns merely from users’ historical check-in sequences while overlooking the significance of user preference modeling. In this work, a novel Point-of-Interest Transformer (POIFormer) with contrastive user preference modeling is developed for end-to-end next location recommendation. This model consists of three major modules: a history encoder, a query generator, and a preference decoder. The history encoder is designed to model mobility patterns from historical check-in sequences, while the query generator explicitly learns user preferences to generate user-specific intention queries. Finally, the preference decoder combines the intention queries and historical information to predict the user’s next location. Extensive comparisons with representative schemes and ablation studies on four real-world datasets demonstrate the effectiveness and superiority of the proposed scheme under various settings.

2022

Urban Region Profiling via A Multi-Graph Representation Learning Framework

Yan Luo, Fu-Lai Chung, and Kai Chen

In Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM 2022), 2022

Urban region profiling can benefit urban analytics. Although existing studies have made great efforts to learn urban region representations from multi-source urban data, there are still three limitations: (1) Most related methods focus merely on global-level inter-region relations while overlooking local-level geographical contextual signals and intra-region information; (2) Most previous works fail to develop an effective yet integrated fusion module which can deeply fuse multi-graph correlations; (3) State-of-the-art methods do not perform well in regions with high-variance socioeconomic attributes. To address these challenges, we propose a multi-graph representation learning framework, called Region2Vec, for urban region profiling. Specifically, in addition to encoding human mobility for inter-region relations, geographic neighborhood is introduced for capturing geographical contextual information, while POI side information is adopted for representing intra-region information by knowledge graph. Then, graphs are used to capture accessibility, vicinity, and functionality correlations among regions. To consider the discriminative properties of multiple graphs, an encoder-decoder multi-graph fusion module is further proposed to jointly learn comprehensive representations. Experiments on real-world datasets show that Region2Vec can be employed in three applications and outperforms all state-of-the-art baselines. In particular, Region2Vec performs better than previous studies in regions with high-variance socioeconomic attributes.

Geo-Tile2Vec: A Multi-Modal and Multi-Stage Embedding Framework for Urban Analytics

Yan Luo, Chak-Tou Leong, Shuhai Jiao, and 3 more authors

ACM Transactions on Spatial Algorithms and Systems, 2022

Cities are very complex systems. Representing urban regions is essential for exploring, understanding, and predicting properties and features of cities. The enrichment of multi-modal urban big data has provided opportunities for researchers to enhance urban region embedding. However, existing works have failed to develop an integrated pipeline that fully utilizes effective and informative data sources within geographic units. In this paper, we regard a geo-tile as a geographic unit and propose a multi-modal and multi-stage representation learning framework, namely Geo-Tile2Vec, for urban analytics, especially for identifying urban region properties. Specifically, in the early stage, geo-tile embeddings are first inferred from dynamic mobility events, which are combinations of point-of-interest (POI) data and trajectory data, by a Word2Vec-like model and metric learning. Then, in the latter stage, we use static street-level imagery to further enrich the embedding information by metric learning. Lastly, the framework learns distributed geo-tile embeddings for the given multi-modal data. We conduct experiments on real-world urban datasets. Four downstream tasks, i.e., main POI category classification, main land use category classification, restaurant average price regression, and firm number regression, are adopted for validating the effectiveness of the proposed framework in representing geo-tiles. Our proposed framework can significantly improve the performance of all downstream tasks. In addition, we demonstrate that geo-tiles with similar urban region properties are geometrically closer in the vector space.

A Multi-Dimensional City Data Embedding Model for Improving Predictive Analytics and Urban Operations

Zhe Jing*, Yan Luo*, Xiaotong Li, and 1 more author

Industrial Management & Data Systems, 2022

The smart city is a potential solution to the problems caused by the unprecedented speed of urbanization. However, the increasing availability of big data is a challenge for transforming a city into a smart one. Conventional statistics and econometric methods may not work well with big data. One promising direction is to leverage advanced machine learning tools in analyzing big data about cities. In this paper, we propose a model to learn region embedding. The learned embedding can be used for more accurate prediction by representing discrete variables as continuous vectors that encode the meaning of a region. Specifically, we use the random walk and skip-gram methods to learn embedding and update the preliminary embedding generated by graph convolutional network (GCN). We further apply this model to a real-world dataset from Manhattan, New York, and use the learned embedding for crime event prediction. The results show that the proposed model can learn multi-dimensional city data more accurately. Thus, it helps cities transform themselves into smarter ones that are more sustainable and efficient.

2021

Characterizing Tourist Daily Trip Chains Using Mobile Phone Big Data

Yan Luo

arXiv preprint, 2021

Tourists tend to visit multiple destinations on their trips out of variety-seeking motivations. Thus, it is critical to discover travel patterns involving multiple destinations in tourism research. Existing relevant research has mostly relied on survey data or focused on citizens due to the lack of large-scale, fine-grained tourism datasets. Several scholars have mentioned the notion of trip chains, but little work has been done towards quantitatively identifying the structures of trip chains. In this paper, we propose a model for quantitatively characterizing tourist daily trip chains. After applying this model to tourist mobile phone big data, underlying tourist travel patterns are discovered. Through the framework, we find that: (1) Most "hybrid" (inter-city and intra-city) and "intra-city" (only intra-city) patterns can be captured by only 13 key trip chains relatively; (2) For two continuous days, almost all kinds of original chains have a rather high probability to transfer to either the first two transferred chains, or other infrequent chains in our study areas; (3) The principle of least effort (PLE) affects tourists’ structures of trip chains. We can use average degree and average travel distance to interpret tourist travel behavior (achieving tasks in PLE). This study not only demonstrates the complex daily travel trip chains from tourism big data, but also fills the gap in tourism literature on multi-destination trips by discovering significant and underlying patterns based on mobile datasets.

2020

Road Network Extraction from GPS Trajectories – A Tensor Voting Based Algorithm

Yan Luo, Longgang Xiang, Yang Xu, and 1 more author

In Proceedings of the 28th Geographical Information Science Research UK Conference (GISRUK 2020), 2020

This paper introduces a tensor voting based algorithm for automatic road extraction from GPS trajectories. By performing the algorithm over three selected sites in Wuhan, China, the experimental results show that the proposed method can extract comprehensive road networks by effectively identifying intersections and road sections. The algorithm has a good anti-noise effect and remains robust across the three experimental sites.