AWS EC2 metadata fetcher in Python
Update
- https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
This seems to be the schema for the instance metadata, though the way they describe it isn't really handy for someone like me.
Original Post
I wrote a tool to fetch all the EC2 instance metadata available under http://169.254.169.254/latest/. Can somebody review my code and the underlying heuristics for telling whether the output of a particular `GET` query is

- a "directory listing" guiding me to more `GET` queries available down the path, or
- actual metadata represented by the path?
For example, if you run these two valid queries,

    [ec2-user@ip-...snip... ~]$ curl 169.254.169.254/latest/; echo
    dynamic
    meta-data
    user-data
    [ec2-user@ip-...snip... ~]$ curl 169.254.169.254/latest/meta-data/network/interfaces/macs/06:...snip...:e6/security-group-ids/; echo
    sg-89...snip...
    sg-84...snip...
    sg-2e...snip...

your program won't know which of them signifies a directory listing and which actual metadata.
So my strategy is:

- Assume every output line guides me to a new `GET` path and try to throw it at 169.254.169.254.
- If it returned a string whose first line is `{`, it must have been a JSON document, so pretty-print it.
- If any of the child queries didn't return `HTTP 200`, the original query must have returned actual metadata, so print it out.
- If any of the child queries returned (with `HTTP 200`) a string identical to the last path component of the original query, we're probably trapped in an infinite loop, and the original result must have been actual metadata rather than a directory listing. E.g.

        [ec2-user@ip-172-24-0-105 ~]$ curl 169.254.169.254/latest/meta-data/network/interfaces/macs/06:...snip...:e6/security-group-ids/; echo
        sg-89...snip...
        sg-84...snip...
        sg-2...snip...
        [ec2-user@ip-172-24-0-105 ~]$ curl 169.254.169.254/latest/meta-data/network/interfaces/macs/06:...snip...:e6/security-group-ids/sg-89...snip...; echo
        sg-89...snip...
        sg-84...snip...
        sg-2e...snip...
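
To make the rules above easier to review in isolation, here is a minimal sketch of the decision as a pure function with no network access. The name `classify` and the `children` mapping are made up for illustration. It mirrors the posted code rather than the prose: the code treats a response as metadata only when all child queries fail, whereas the list above says "any".

```python
from typing import Dict, Optional


def classify(body: str, children: Dict[str, Optional[str]]) -> str:
    """Classify one response as 'json', 'leaf' or 'directory'.

    body:     text returned for the path being examined.
    children: maps each non-empty line of body to the text returned by the
              child query for that line, or None if it was not HTTP 200.
    """
    if body.startswith("{"):
        return "json"
    candidates = [line for line in body.split("\n") if line]
    bodies = [children.get(c) for c in candidates]
    # No child query succeeded: the lines were values, not names.
    if all(b is None for b in bodies):
        return "leaf"
    # Every successful child response contains the child's own name, so the
    # extra path component was probably ignored (the "infinite loop" case).
    if all(c in b.split() for c, b in zip(candidates, bodies) if b):
        return "leaf"
    return "directory"


# Hypothetical responses, shaped like the curl examples above.
print(classify("dynamic\nmeta-data",
               {"dynamic": "instance-identity", "meta-data": "ami-id"}))  # directory
```

Keeping the decision separate from the HTTP calls means the heuristic can be checked against hand-written cases before it is pointed at a real instance.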
Shown below is my code:

    #!/usr/bin/env python3
    import http
    import http.client
    import os.path
    import json
    import pprint
    import logging
    from typing import Optional

    ADDR = '169.254.169.254'
    ADDR = '127.0.0.1:8080'  # SSH port forwarding

    conn = http.client.HTTPConnection(ADDR, timeout=3.)


    def join(a: str, b: str) -> str:
        # The trailing '' makes the joined path end with a separator.
        return os.path.join(a, b, '')


    def check(parent_path: str, filename: str, s: Optional[str]) -> None:
        # time.sleep(.3)
        joined_dir = join(parent_path, filename)
        if s is None:
            return
        if s[0] == '{':
            # Looks like a JSON document: pretty-print it.
            print(joined_dir)
            pprint.pprint(json.loads(s))
            return
        # Treat every non-empty line as a candidate child path and query it.
        file_candidates = [l for l in s.split('\n') if l]
        tried = [(c, get(join(joined_dir, c))) for c in file_candidates]
        if (all(t is None for c, t in tried) or
                all(c in t.split() for c, t in tried if t)):
            # just a string
            print(joined_dir)
            print(s)
            return
        for c, t in tried:
            if t:
                check(joined_dir, c, t)


    def get(path: str) -> Optional[str]:
        # Returns the stripped response body, or None unless the status is 200.
        logging.debug("*** (%r)", path)
        conn.request('GET', path)
        resp = conn.getresponse()
        s: Optional[str] = None
        if resp.status == http.HTTPStatus.OK:
            s = resp.read().decode().strip()
        logging.debug("*** => %s", s)
        return s


    if '__main__' == __name__:
        logging.basicConfig(level=logging.WARN,
                            format="[%(asctime)s.%(msecs)03d] %(levelname)s %(filename)s:%(lineno)d:%(funcName)s %(message)s",
                            datefmt="%H:%M:%S")
        check('/', 'latest', get('/latest/'))

A typical output looks like this, but I'm not even sure if I really recurse into all the available paths.

    /latest/dynamic/instance-identity/document/
    {'accountId': '...snip...',
     'architecture': 'x86_64',
     'availabilityZone': 'ap-southeast-1a',
     'billingProducts': None,
     'devpayProductCodes': None,
     'imageId': 'ami-4f89f533',
     'instanceId': 'i-...snip...',
     'instanceType': 't2.nano',
     'kernelId': None,
     'marketplaceProductCodes': None,
     'pendingTime': '2018-02-06T09:12:43Z',
     'privateIp': '172....snip...',
     'ramdiskId': None,
     'region': 'ap-southeast-1',
     'version': '2017-09-30'}
    /latest/dynamic/instance-identity/pkcs7/
    MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAaCAJIAEggHcewog
    ...snip...
    FQCvZ++/XNsRExoPCHjmzzC2xKegmgAAAAAAAA==
    /latest/dynamic/instance-identity/signature/
    GjT4Lzh50Je9glKezoLw32/UnZC+VghBSVav8twUFXDFiEIcy+uwiPaEIWmzBVcR4QlVrknTXVGZ
    ups/f9mLmUMoJbjb3ZEMWF4CiB0oUh7FNSUNh25KzMPb4sY15Ay7sDLR30uLJpQfJsUdR/T6nLBG
    wYC9WdGlj3SNo+/tre0=
    /latest/dynamic/instance-identity/rsa2048/
    MIAGCSqGSIb3DQEHAqCAMIACAQExDzANBglghkgBZQMEAgEFADCABgkqhkiG9w0BBwGggCSABIIB
    ...snip...
    gLVMLECQv+gesZ2j42MPDGFUJ1VzsRQg9mZkOc8AAAAAAAA=
    /latest/meta-data/ami-id/
    ami-4f89f533
    /latest/meta-data/ami-launch-index/
    0
    /latest/meta-data/ami-manifest-path/
    (unknown)
    /latest/meta-data/block-device-mapping/ami/
    /dev/xvda
    /latest/meta-data/block-device-mapping/root/
    /dev/xvda
    /latest/meta-data/hostname/
    ip-...snip....ap-southeast-1.compute.internal
    /latest/meta-data/iam/info/
    {'Code': 'Success',
     'InstanceProfileArn': 'arn:aws:iam::...snip...:instance-profile/admin-server-DEV',
     'InstanceProfileId': '...snip...',
     'LastUpdated': '2018-02-08T01:17:18Z'}
    ...snip...

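
One way to gain confidence that the recursion reaches every path, without touching the real service, might be to run the same heuristic against a small in-memory tree whose leaves are known in advance. The sketch below is an assumption-laden stand-in, not the posted tool: `FAKE`, `crawl` and the paths in the tree are invented for illustration, and plain string concatenation replaces `os.path.join`.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the metadata service: path -> response body.
# Any path not listed here behaves like a non-200 response.
FAKE: Dict[str, str] = {
    "/latest/": "dynamic\nmeta-data",
    "/latest/dynamic/": "instance-identity",
    "/latest/dynamic/instance-identity/": "document",
    "/latest/dynamic/instance-identity/document/": '{"region": "ap-southeast-1"}',
    "/latest/meta-data/": "ami-id\nsecurity-group-ids",
    "/latest/meta-data/ami-id/": "ami-4f89f533",
    "/latest/meta-data/security-group-ids/": "sg-a\nsg-b",
    # The parent listing is echoed back for these, as in the question.
    "/latest/meta-data/security-group-ids/sg-a/": "sg-a\nsg-b",
    "/latest/meta-data/security-group-ids/sg-b/": "sg-a\nsg-b",
}


def crawl(tree: Dict[str, str], root: str = "/latest/") -> Tuple[Dict[str, str], List[str]]:
    """Apply the question's heuristic to tree; return (leaves, requested paths)."""
    requested: List[str] = []
    leaves: Dict[str, str] = {}

    def get(path: str) -> Optional[str]:
        requested.append(path)
        return tree.get(path)

    def check(path: str, body: Optional[str]) -> None:
        if not body:
            return
        if body.startswith("{"):
            leaves[path] = body
            return
        tried = [(c, get(path + c + "/")) for c in body.split("\n") if c]
        if all(t is None for _, t in tried) or all(c in t.split() for c, t in tried if t):
            leaves[path] = body
            return
        for c, t in tried:
            if t:
                check(path + c + "/", t)

    check(root, get(root))
    return leaves, requested


leaves, requested = crawl(FAKE)
for path in sorted(leaves):
    print(path)
print(len(requested), "requests")
```

Comparing `set(leaves)` with the leaf paths written into the tree shows whether anything was skipped, and `requested` records how many queries the heuristic needed to get there.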
python python-3.x recursion rest amazon-web-services

Your narrative of the approach to making sense of the inconsistent behavior of the metadata service seems valid. I can't comment on your implementation of it. My solution was to assume response lines without a trailing slash were final destinations, and hard-code the relatively few exceptions to that rule. Even though it is valid, you probably don't want to do the infinite loop and 404 detection in production because the metadata service is rate-limited and this technique means you'll be making almost 2x the number of requests you actually need in order to gather all the metadata.
– Michael - sqlbot, Feb 8 at 6:07

You could make use of this library: github.com/adamchainz/ec2-metadata
– hjpotter92, Feb 10 at 8:37

@Michael-sqlbot "to assume response lines without a trailing slash were final destinations" Doesn't hold for the very top directory. "you'll be making almost 2x the number of requests you actually need" No, my code above does caching in a variable `tried`
– nodakai, Feb 11 at 6:23

@hjpotter92 Several fields seem to be missing, at least from the doc
– nodakai, Feb 11 at 6:25

@nodakai agreed about the top directory, but those are among the exceptions; the general rule holds. Also, I say ~2x requests because you are traversing one level deeper than you need to at each leaf, only to discover a dead end.
– Michael - sqlbot, Feb 11 at 13:32
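
The rule described in these comments (entries without a trailing slash are final destinations, with a few hard-coded exceptions) could be sketched roughly as follows. This assumes listings mark subdirectories with a trailing slash; `SLASH_TREE`, `DIR_EXCEPTIONS` and `crawl_by_slash` are invented for illustration, and the contents of a real exception list would have to be checked against the service.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-in for the service: path -> response body.
SLASH_TREE: Dict[str, str] = {
    "/latest/": "dynamic\nmeta-data",
    "/latest/dynamic/": "instance-identity/",
    "/latest/dynamic/instance-identity/": "document",
    "/latest/dynamic/instance-identity/document": '{"region": "ap-southeast-1"}',
    "/latest/meta-data/": "ami-id\nsecurity-group-ids",
    "/latest/meta-data/ami-id": "ami-4f89f533",
    "/latest/meta-data/security-group-ids": "sg-a\nsg-b",
}

# Hypothetical exception list: entries that are directories even though the
# listing shows them without a trailing slash (the top directory here).
DIR_EXCEPTIONS: Set[str] = {"/latest/dynamic", "/latest/meta-data"}


def crawl_by_slash(tree: Dict[str, str], root: str = "/latest/") -> Tuple[Dict[str, str], List[str]]:
    """Walk tree using the trailing-slash rule; return (leaves, requested paths)."""
    requested: List[str] = []
    leaves: Dict[str, str] = {}

    def get(path: str) -> Optional[str]:
        requested.append(path)
        return tree.get(path)

    def walk(path: str) -> None:
        body = get(path)
        if body is None:
            return
        for entry in body.split("\n"):
            if not entry:
                continue
            child = path + entry
            if entry.endswith("/"):
                walk(child)            # marked as a directory by the listing
            elif child in DIR_EXCEPTIONS:
                walk(child + "/")      # known directory without the slash
            else:
                value = get(child)     # final destination: fetch once, never probe deeper
                if value is not None:
                    leaves[child] = value

    walk(root)
    return leaves, requested


leaves, requested = crawl_by_slash(SLASH_TREE)
print(len(requested), "requests for", len(leaves), "leaves")
```

In this toy tree every path is requested exactly once, because no leaf is probed one level deeper to find a dead end. The trade-off is that the exception list has to be maintained by hand.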
edited Feb 13 at 9:45
asked Feb 8 at 2:09
nodakai