This is mostly a copy of the original Kaggle tutorial, but let's walk through it. <a href="https://www.kaggle.com/code/alexisbcook/manipulating-geospatial-data" rel="noopener noreferrer" target="_blank">https://www.kaggle.com/code/alexisbcook/manipulating-geospatial-data</a> <pre class="ql-syntax" spellcheck="false">import pandas as pd
import geopandas as gpd
import numpy as np
import folium
from folium import Marker
import warnings 
warnings.filterwarnings(&#39;ignore&#39;)
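</pre>
Introduction

In this tutorial, you'll learn about two common manipulations for geospatial data: geocoding and table joins.

Geocoding

Geocoding is the process of converting the name of a place or an address to a location on a map. If you have ever looked up a geographic location from a landmark description using Google Maps, Bing Maps, or Baidu Maps, then you have used a geocoder!

We'll use geopy for all of our geocoding.
<pre class="ql-syntax" spellcheck="false">from geopy.geocoders import Nominatim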
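</pre>
In the code cell above, Nominatim refers to the geocoding software that will be used to generate locations.

We begin by instantiating the geocoder. Then we only need to pass the name or address as a Python string (in this case, &#34;Pyramid of Khufu&#34;, also known as the Great Pyramid of Giza).

If the geocoding is successful, it returns a geopy.location.Location object with two important attributes (if no match is found, geocode() returns None instead):
- the &#34;point&#34; attribute contains the (latitude, longitude) location, and
- the &#34;address&#34; attribute contains the full address.
<pre class="ql-syntax" spellcheck="false">geolocator = Nominatim(user_agent=&#34;kaggle_learn&#34;)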
location = geolocator.geocode(&#34;Pyramid of Khufu&#34;)

print(location.point)
print(location.address)
29 58m 44.976s N, 31 8m 3.17625s E
هرم خوفو, شارع ابو الهول السياحي, نزلة البطران, الجيزة, 12125, مصر
</pre>
The value of the &#34;point&#34; attribute is a geopy.point.Point object, and the latitude and longitude can be read from its latitude and longitude attributes, respectively.
<pre class="ql-syntax" spellcheck="false">point = location.point
print(&#34;Latitude:&#34;, point.latitude)
print(&#34;Longitude:&#34;, point.longitude)
Latitude: 29.97916
Longitude: 31.134215625236113
</pre>
It's often the case that we need to geocode many different addresses. For instance, say we want to obtain the locations of the top 100 universities in Europe.
<pre class="ql-syntax" spellcheck="false">universities = pd.read_csv(&#34;../input/geospatial-learn-course-data/top_universities.csv&#34;)
universities.head()
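</pre>
We can then use a lambda function to apply the geocoder to every row in the DataFrame. (A try/except statement accounts for the case where geocoding fails.)
<pre class="ql-syntax" spellcheck="false">def my_geocoder(name):
    # geocode() returns None when nothing matches, so .point raises AttributeError
    try:
        point = geolocator.geocode(name).point
        return pd.Series({&#39;Latitude&#39;: point.latitude, &#39;Longitude&#39;: point.longitude})
    except Exception:
        # Mark failures with NaN so they can be counted and dropped below
        return pd.Series({&#39;Latitude&#39;: np.nan, &#39;Longitude&#39;: np.nan})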

universities[[&#39;Latitude&#39;, &#39;Longitude&#39;]] = universities.apply(lambda x: my_geocoder(x[&#39;Name&#39;]), axis=1)

print(&#34;{}% of addresses were geocoded!&#34;.format(
    (1 - sum(np.isnan(universities[&#34;Latitude&#34;])) / len(universities)) * 100))

# Drop universities that were not successfully geocoded
universities = universities.loc[~np.isnan(universities[&#34;Latitude&#34;])]
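# Build point geometries and declare the CRS as WGS 84 (EPSG:4326);
# the {'init': ...} dict form is deprecated in recent pyproj/GeoPandas
universities = gpd.GeoDataFrame(
    universities,
    geometry=gpd.points_from_xy(universities.Longitude, universities.Latitude),
    crs=&#34;EPSG:4326&#34;)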
universities.head()
91.0% of addresses were geocoded!
</pre>
Next, we visualize all of the locations returned by the geocoder. Note that a few of the locations are clearly inaccurate, since they are not in Europe.
<pre class="ql-syntax" spellcheck="false"># Create a map
m = folium.Map(location=[54, 15], tiles=&#39;openstreetmap&#39;, zoom_start=2)

# Add points to the map
for idx, row in universities.iterrows():
    Marker([row[&#39;Latitude&#39;], row[&#39;Longitude&#39;]], popup=row[&#39;Name&#39;]).add_to(m)

# Display the map
m
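</pre>
Table joins

Now, we'll switch topics and think about how to combine data from different sources.

Attribute join

You already know how to use pd.DataFrame.join() to combine information from multiple DataFrames with a shared index. We refer to this way of joining data (by simply matching values in the index) as an attribute join.

When performing an attribute join with a GeoDataFrame, it's best to use gpd.GeoDataFrame.merge(). To illustrate this, we'll work with europe_boundaries, a GeoDataFrame containing the boundaries of every country in Europe. Its first five rows are shown after the code that builds it.
<pre class="ql-syntax" spellcheck="false"># Note: gpd.datasets was removed in GeoPandas 1.0, so this line needs an older release
world = gpd.read_file(gpd.datasets.get_path(&#39;naturalearth_lowres&#39;))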
europe = world.loc[world.continent == &#39;Europe&#39;].reset_index(drop=True)

europe_stats = europe[[&#34;name&#34;, &#34;pop_est&#34;, &#34;gdp_md_est&#34;]]
europe_boundaries = europe[[&#34;name&#34;, &#34;geometry&#34;]]
</pre> <pre class="ql-syntax" spellcheck="false">europe_boundaries.head()
</pre>
We'll join it with europe_stats, a DataFrame containing the estimated population and gross domestic product (GDP) of each country.
<pre class="ql-syntax" spellcheck="false">europe_stats.head()
</pre>
We do the attribute join in the code cell below. The on argument is set to the name of the column used to match rows in europe_boundaries to rows in europe_stats.
<pre class="ql-syntax" spellcheck="false"># Use an attribute join to merge data about countries in Europe
europe = europe_boundaries.merge(europe_stats, on=&#34;name&#34;)
europe.head()
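</pre>
Another type of join is a spatial join. With a spatial join, we combine GeoDataFrames based on the spatial relationship between the objects in their &#34;geometry&#34; columns. For instance, we already have the GeoDataFrame universities, which contains the geocoded addresses of European universities.

We can then use a spatial join to match each university to its corresponding country. We do this with gpd.sjoin().
<pre class="ql-syntax" spellcheck="false"># Use spatial join to match universities to countries in Europe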
european_universities = gpd.sjoin(universities, europe)

# Investigate the result
print(&#34;We located {} universities.&#34;.format(len(universities)))
print(&#34;Only {} of the universities were located in Europe (in {} different countries).&#34;.format(
    len(european_universities), len(european_universities.name.unique())))

european_universities.head()
We located 91 universities.
Only 87 of the universities were located in Europe (in 14 different countries).
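</pre>
The spatial join above looks at the &#34;geometry&#34; columns of both GeoDataFrames. If a Point object from the universities GeoDataFrame intersects a Polygon object from the europe GeoDataFrame, the corresponding rows are combined and added as a single row of the european_universities GeoDataFrame. Otherwise, countries without a matching university (and universities without a matching country) are omitted from the results.

gpd.sjoin() can be customized for different types of joins through the how and op arguments (op is named predicate in GeoPandas 0.10 and later). For instance, you can do the equivalent of a SQL left (or right) join by setting how=&#39;left&#39; (or how=&#39;right&#39;). See the GeoPandas documentation for details.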
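To make the effect of the how argument concrete, here is a minimal sketch on toy data. The two square "countries" and three "schools" are hypothetical and not part of the tutorial's dataset, and the sketch assumes a GeoPandas version that accepts predicate= (older releases used op= for the same purpose).

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: two square "countries" and three "schools".
countries = gpd.GeoDataFrame(
    {"country": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(2, 0, 3, 1)],
    crs="EPSG:4326",
)
schools = gpd.GeoDataFrame(
    {"school": ["u1", "u2", "u3"]},
    geometry=[Point(0.5, 0.5), Point(2.5, 0.5), Point(5, 5)],
    crs="EPSG:4326",
)

# Default behaviour is an inner join on intersecting geometries:
# u3 lies outside both squares, so it is dropped.
inner = gpd.sjoin(schools, countries)

# how="left" keeps every school; rows without a matching country
# get NaN in the columns that come from `countries`.
left = gpd.sjoin(schools, countries, how="left", predicate="within")

print(len(inner), len(left))
print(left[["school", "country"]])
```

Here inner has two rows and left has three, with a missing country for u3. This is the same reason the tutorial's inner join returned fewer universities than were geocoded.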

A look at Kaggle's intro courses: Manipulating Geospatial Data

Yuichiro Minato