Amtrak real time train data
Recently the US passenger rail operator Amtrak announced a nationwide real-time map of its services.
That's great... until you try it.
The United States is big, even though the number of trains is fairly small. The number of Google map tiles is big. The polygons that show the routes are enormous. The nationwide train data is big.
That's a lot of data to load, and so slow to display, if all I really want is real-time data on a single train (is my train late?).
After looking at the GETs emitted by all the (slow, unnecessary) JavaScript on that site, I figured out how to bypass it and pull just the table of current train status. It's 400 KB, but that's a lot less than the 1 MB-plus it takes to go through the website and map:
https://www.googleapis.com/mapsengine/v1/tables/01382379791355219452-08584582962951999356/
features?version=published&key=AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4&maxResults=250&
callback=jQuery19106722136430546803_1388112103417&dataType=jsonp&
jsonpCallback=retrieveTrainsData&contentType=application%2Fjson&_=1388112103419
The only fields required are version and key (which is different from the site cookie):
https://www.googleapis.com/mapsengine/v1/tables/01382379791355219452-08584582962951999356/
features?version=published&key=AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4
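The trimmed URL can also be assembled from its parts, which makes it easier to see which parameters matter. This is only a sketch using the standard library; the table id and key are copied from the URL above, while the names `BASE`, `TRAIN_TABLE_ID`, `API_KEY`, and `features_url` are made up for illustration:

```python
from urllib.parse import urlencode

# Table id and key are copied from the URL above; the constant and
# function names are illustrative, not anything the Amtrak site defines.
BASE = "https://www.googleapis.com/mapsengine/v1/tables"
TRAIN_TABLE_ID = "01382379791355219452-08584582962951999356"
API_KEY = "AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4"


def features_url(table_id, key=API_KEY):
    """Build the features URL with only the two required parameters."""
    query = urlencode({"version": "published", "key": key})
    return "{}/{}/features?{}".format(BASE, table_id, query)


print(features_url(TRAIN_TABLE_ID))
```

Swapping in a different table id should give the matching URL for another table, assuming it accepts the same key.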
So we can easily download the current status table using wget:
$ wget -O train_status "https://www.googleapis.com/mapsengine/v1/tables/01382379791355219452-08584582962951999356/features?version=published&key=AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4&jsonpCallback=retrieveTrainsData"
Since parsing 400 KB of JSON with a shell script would be awkward and annoying, let's change over to Python 3:
#!/usr/bin/python3
import httplib2

# Keep the URL on one line: a line break inside the string would end up in the request
url = "https://www.googleapis.com/mapsengine/v1/tables/01382379791355219452-08584582962951999356/features?version=published&key=AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4&jsonpCallback=retrieveTrainsData"
h = httplib2.Http(None)
resp, content = h.request(url, "GET")
Let's tell Python that the content is a JSON string, and iterate through the list of trains looking for train number 7. That's a handy train because each run takes over 24 hours, so it will always have a status.
import json

all_trains = json.loads(content.decode("UTF-8"))
for train in all_trains["features"]:
    if train["properties"]["TrainNum"] == "7":
        pass  # do something with this train
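The same filter can be wrapped in a small function that returns every match, which helps because a long-distance train can have more than one run in the table at a time. A minimal sketch; `find_trains` is a hypothetical helper, and `sample` is a made-up, heavily trimmed stand-in for the downloaded table:

```python
def find_trains(all_trains, train_number):
    """Return every feature whose TrainNum matches train_number."""
    return [
        feature
        for feature in all_trains.get("features", [])
        if feature.get("properties", {}).get("TrainNum") == train_number
    ]


# Hypothetical stand-in for the parsed JSON; the real table has many more fields.
sample = {
    "features": [
        {"properties": {"TrainNum": "7"}},
        {"properties": {"TrainNum": "8"}},
        {"properties": {"TrainNum": "7"}},
    ]
}

matches = find_trains(sample, "7")
print(len(matches))  # 2
```

Note that the train number is compared as a string, the same way the loop above does it.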
After locating the train we care about, here is data about it:
# GeoJSON lists coordinates as [longitude, latitude]
longitude = train["geometry"]["coordinates"][0]
latitude = train["geometry"]["coordinates"][1]
speed = train["properties"]["Velocity"]
report_time = train["properties"]["LastValTS"]
next_station = train["properties"]["EventCode"]
Let's find a station (FAR) along the route. This is a little tougher, because Python's JSON module reads each Station as a string instead of a dict. We need to identify the Station properties and feed each one through the JSON parser (again).
for prop in train["properties"].keys():
    if "Station" in prop:
        sta = json.loads(train["properties"][prop])
        if sta["code"] == "FAR":
            # .get() returns None when a key is absent for this station
            # Schedule data
            station_time_zone = sta.get("tz")
            scheduled_arrival_time = sta.get("scharr")    # Only if a long stop
            scheduled_departure_time = sta.get("schdep")  # All stations
            # Past stations
            actual_arrival_time = sta.get("postarr")
            actual_departure_time = sta.get("postdep")
            past_station_status = sta.get("postcmnt")
            # Future stations
            estimated_arrival_time = sta.get("estarr")
            future_station_status = sta.get("estarrcmnt")
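The double decoding can be tucked into one function. This is a sketch rather than a tested client: `find_station` is a hypothetical helper, and the property names and values in `sample_properties` are invented; only the key names inside each station string (`code`, `schdep`, `postcmnt`) come from the fields listed above.

```python
import json


def find_station(properties, station_code):
    """Return the decoded station dict for station_code, or None."""
    for name, value in properties.items():
        if "Station" not in name or not isinstance(value, str):
            continue
        try:
            sta = json.loads(value)
        except ValueError:
            continue  # not JSON after all; skip it
        if isinstance(sta, dict) and sta.get("code") == station_code:
            return sta
    return None


# Hypothetical properties dict with made-up names and times.
sample_properties = {
    "TrainNum": "7",
    "Station1": json.dumps({"code": "CHI", "schdep": "12/25/2013 14:15:00"}),
    "Station2": json.dumps({"code": "FAR", "schdep": "12/26/2013 03:35:00",
                            "postcmnt": "1 HR 4 MI LATE"}),
}

far = find_station(sample_properties, "FAR")
print(far["schdep"], far.get("postcmnt"))
```

Returning `None` for a station that is not on the route keeps the caller from having to catch a `KeyError`.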
The final Python script for Train 7 at FAR is:
#!/usr/bin/python3
import httplib2
import json

url = "https://www.googleapis.com/mapsengine/v1/tables/01382379791355219452-08584582962951999356/features?version=published&key=AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4&jsonpCallback=retrieveTrainsData"
h = httplib2.Http(None)
resp, content = h.request(url, "GET")
all_trains = json.loads(content.decode("UTF-8"))
for train in all_trains["features"]:
    if train["properties"]["TrainNum"] == "7":
        for prop in train["properties"].keys():
            if "Station" in prop:
                # Each Station property is itself a JSON string
                sta = json.loads(train["properties"][prop])
                if sta["code"] == "FAR":
                    print("Train number {}".format(train["properties"]["TrainNum"]))
                    if "schdep" in sta.keys():
                        print("Scheduled: {}".format(sta["schdep"]))
                    if "postcmnt" in sta.keys():
                        print("Status: {}".format(sta["postcmnt"]))
                    if "estarrcmnt" in sta.keys():
                        print("Status: {}".format(sta["estarrcmnt"]))
And the result looks rather like:
$ python3 "Current train status.py"
Train number 7
Scheduled: 12/26/2013 03:35:00
Status: 1 HR 4 MI LATE
Train number 7
Scheduled: 12/27/2013 03:35:00
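A status string such as "1 HR 4 MI LATE" is easier to work with as a number of minutes. A rough sketch, assuming the comment always follows the pattern seen in the output above; `minutes_late` is a hypothetical helper, and any other wording (which this table may well use) simply yields `None`:

```python
import re

# Matches "1 HR 4 MI LATE", and by assumption "25 MI LATE" or "2 HR LATE".
_LATE = re.compile(r"^(?:(\d+)\s*HR)?\s*(?:(\d+)\s*MI)?\s*LATE$")


def minutes_late(status):
    """Return minutes late for strings like '1 HR 4 MI LATE', else None."""
    match = _LATE.match(status.strip().upper())
    if not match or not (match.group(1) or match.group(2)):
        return None
    hours = int(match.group(1) or 0)
    minutes = int(match.group(2) or 0)
    return hours * 60 + minutes


print(minutes_late("1 HR 4 MI LATE"))  # 64
```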
Two more items.
First, the Python script can be tweaked to show *all* trains for a station. Just remove the train number filter. To be useful, you might add some date handling, since some scheduled times can be a day in the future or past.
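One way to handle the dates is to parse the scheduled time and keep only entries close to the present. A minimal sketch, assuming the timestamps always look like "12/26/2013 03:35:00" as in the output above; `within_window` and the 12-hour default are illustrative choices, and the station's time zone is ignored here:

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%m/%d/%Y %H:%M:%S"  # matches "12/26/2013 03:35:00"


def within_window(scheduled, now, hours=12):
    """True if the scheduled time is within +/- hours of now (naive times)."""
    when = datetime.strptime(scheduled, TIME_FORMAT)
    return abs(when - now) <= timedelta(hours=hours)


# A fixed "now" keeps the example repeatable; a real script would use
# datetime.now() adjusted to the station's local time.
now = datetime(2013, 12, 26, 2, 0, 0)
print(within_window("12/26/2013 03:35:00", now))  # True
print(within_window("12/27/2013 03:35:00", now))  # False
```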
Second, the station information table has a similar static page: https://www.googleapis.com/mapsengine/v1/tables/01382379791355219452-17620014524089761219/features?&version=published&maxResults=1000&key=AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4&callback=jQuery19106722136430546803_1388112103417&dataType=jsonp&jsonpCallback=retrieveStationData&contentType=application%2Fjson&_=13881121034
As with the train status URL, many of the parameters are dummies. This shorter URL works, too: https://www.googleapis.com/mapsengine/v1/tables/01382379791355219452-17620014524089761219/features?&version=published&key=AIzaSyCVFeFQrtk-ywrUE0pEcvlwgCqS6TJcOW4&callback=jQuery19106722136430546803_1388112103417
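If a `callback` parameter is left in the URL, the response may come back wrapped as JSONP, that is, `name({...});` rather than bare JSON, and `json.loads` will choke on it. Whether this endpoint actually wraps its output is an assumption here, so the hypothetical helper below accepts both forms:

```python
import json


def loads_json_or_jsonp(text):
    """Decode text that is either plain JSON or JSON wrapped in callback(...)."""
    stripped = text.strip()
    if stripped[:1] not in ("{", "["):
        # Assume a JSONP wrapper: keep what sits between the first "("
        # and the last ")".
        start = stripped.find("(")
        end = stripped.rfind(")")
        if start != -1 and end > start:
            stripped = stripped[start + 1:end]
    return json.loads(stripped)


print(loads_json_or_jsonp('{"features": []}'))
print(loads_json_or_jsonp('retrieveStationData({"features": []});'))
```

Dropping the `callback` parameter from the URL, if the server allows it, avoids the question entirely.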