AWS EC2 metadata fetcher in Python

























Update



  • https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html

This seems to be the schema for the instance metadata, though the way they describe it isn't really handy for someone like me.



Original Post



I wrote a tool to fetch all the EC2 instance metadata available under http://169.254.169.254/latest/. Can somebody review my code and the underlying heuristics for telling whether the output of a particular GET query is



  1. a "directory listing" guiding me to more GET queries available further down the path, or

  2. the actual metadata represented by the path?

For example, if you run these two valid queries,



[ec2-user@ip-...snip... ~]$ curl 169.254.169.254/latest/; echo
dynamic
meta-data
user-data
[ec2-user@ip-...snip... ~]$ curl 169.254.169.254/latest/meta-data/network/interfaces/macs/06:...snip...:e6/security-group-ids/; echo
sg-89...snip...
sg-84...snip...
sg-2e...snip...


your program won't know which of them signifies a directory listing and which actual metadata.



So my strategy is



  1. assume every output line guides me to a new GET path and try sending it to 169.254.169.254


  2. if the query returned a string whose first line is {, it must have been a JSON document, so pretty-print it


  3. if any of the child queries didn't return HTTP 200, the original query must have returned actual metadata, so print it out



  4. if any of the child queries returned (with HTTP 200) the same string as the last path component of the original query, we're probably trapped in an infinite loop, and the original result must not have been a directory listing but actual metadata. E.g.



    [ec2-user@ip-172-24-0-105 ~]$ curl 169.254.169.254/latest/meta-data/network/interfaces/macs/06:...snip...:e6/security-group-ids/; echo
    sg-89...snip...
    sg-84...snip...
    sg-2...snip...
    [ec2-user@ip-172-24-0-105 ~]$ curl 169.254.169.254/latest/meta-data/network/interfaces/macs/06:...snip...:e6/security-group-ids/sg-89...snip...; echo
    sg-89...snip...
    sg-84...snip...
    sg-2e...snip...


Shown below is my code:



#!/usr/bin/env python3

import http
import http.client
import json
import logging
import os.path
import pprint
from typing import Optional

ADDR = '169.254.169.254'
ADDR = '127.0.0.1:8080'  # SSH port forwarding (overrides the line above)
conn = http.client.HTTPConnection(ADDR, timeout=3.)


def join(a: str, b: str) -> str:
    # The trailing '' makes the result end with a slash.
    return os.path.join(a, b, '')


def check(parent_path: str, filename: str, s: Optional[str]) -> None:
    # time.sleep(.3)
    joined_dir = join(parent_path, filename)
    if s is None:
        return

    # startswith() also copes with an empty body, where s[0] would raise.
    if s.startswith('{'):
        print(joined_dir)
        pprint.pprint(json.loads(s))
        return

    file_candidates = [l for l in s.split('\n') if l]
    tried = [(c, get(join(joined_dir, c))) for c in file_candidates]
    if all(t is None for c, t in tried) or \
       all(c in t.split() for c, t in tried if t):
        # just a string
        print(joined_dir)
        print(s)
        return

    for c, t in tried:
        if t:
            check(joined_dir, c, t)


def get(path: str) -> Optional[str]:
    logging.debug("*** (%r)", path)
    conn.request('GET', path)
    resp = conn.getresponse()
    # Always drain the body, so that a kept-alive connection can be reused
    # for the next request even after a non-200 response.
    body = resp.read()

    s: Optional[str] = None
    if resp.status == http.HTTPStatus.OK:
        s = body.decode().strip()
    logging.debug("*** => %s", s)
    return s


if '__main__' == __name__:
    logging.basicConfig(level=logging.WARN,
                        format="[%(asctime)s.%(msecs)03d] %(levelname)s %(filename)s:%(lineno)d:%(funcName)s %(message)s",
                        datefmt="%H:%M:%S")
    check('/', 'latest', get('/latest/'))
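
As a minimal sketch (not part of the tool above), the decision rules could be isolated into a pure function so that they can be exercised without a live metadata service. The names `classify`, `Fetch` and `FAKE`, and the contents of the fake tree (including the `sg-aaa`/`sg-bbb` IDs), are hypothetical. The function follows the `all(...)` conditions of the posted code, which are stricter than the "any" wording in the strategy list.

```python
from typing import Callable, Dict, Optional

# A fetch function returns the body on HTTP 200 and None otherwise.
Fetch = Callable[[str], Optional[str]]


def classify(path: str, body: str, fetch: Fetch) -> str:
    """Return 'json', 'leaf' or 'listing' for the body found at `path`."""
    if body.startswith('{'):
        return 'json'  # rule 2: looks like a JSON document
    lines = [l for l in body.split('\n') if l]
    children = [(l, fetch(path + l + '/')) for l in lines]
    if all(t is None for _, t in children):
        return 'leaf'  # rule 3: no child path resolves
    if all(l in t.split() for l, t in children if t):
        return 'leaf'  # rule 4: each child's body contains the child's own name
    return 'listing'


# Made-up tree that imitates the behaviour described in the post.
FAKE: Dict[str, str] = {
    '/latest/': 'meta-data',
    '/latest/meta-data/': 'ami-id\nsecurity-group-ids',
    '/latest/meta-data/ami-id/': 'ami-4f89f533',
    '/latest/meta-data/security-group-ids/': 'sg-aaa\nsg-bbb',
    # The service answers these child paths with the parent's body.
    '/latest/meta-data/security-group-ids/sg-aaa/': 'sg-aaa\nsg-bbb',
    '/latest/meta-data/security-group-ids/sg-bbb/': 'sg-aaa\nsg-bbb',
}

if __name__ == '__main__':
    for p, b in FAKE.items():
        print(p, classify(p, b, FAKE.get))
```

Passing the fetch function in as a parameter keeps the heuristic separate from the HTTP code, so a dictionary lookup can stand in for `get()` in tests.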


A typical output looks like this, but I'm not even sure if I really recurse into all the available paths.



/latest/dynamic/instance-identity/document/
{'accountId': '...snip...',
 'architecture': 'x86_64',
 'availabilityZone': 'ap-southeast-1a',
 'billingProducts': None,
 'devpayProductCodes': None,
 'imageId': 'ami-4f89f533',
 'instanceId': 'i-...snip...',
 'instanceType': 't2.nano',
 'kernelId': None,
 'marketplaceProductCodes': None,
 'pendingTime': '2018-02-06T09:12:43Z',
 'privateIp': '172....snip...',
 'ramdiskId': None,
 'region': 'ap-southeast-1',
 'version': '2017-09-30'}
/latest/dynamic/instance-identity/pkcs7/
MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAaCAJIAEggHcewog
...snip...
FQCvZ++/XNsRExoPCHjmzzC2xKegmgAAAAAAAA==
/latest/dynamic/instance-identity/signature/
GjT4Lzh50Je9glKezoLw32/UnZC+VghBSVav8twUFXDFiEIcy+uwiPaEIWmzBVcR4QlVrknTXVGZ
ups/f9mLmUMoJbjb3ZEMWF4CiB0oUh7FNSUNh25KzMPb4sY15Ay7sDLR30uLJpQfJsUdR/T6nLBG
wYC9WdGlj3SNo+/tre0=
/latest/dynamic/instance-identity/rsa2048/
MIAGCSqGSIb3DQEHAqCAMIACAQExDzANBglghkgBZQMEAgEFADCABgkqhkiG9w0BBwGggCSABIIB
...snip...
gLVMLECQv+gesZ2j42MPDGFUJ1VzsRQg9mZkOc8AAAAAAAA=
/latest/meta-data/ami-id/
ami-4f89f533
/latest/meta-data/ami-launch-index/
0
/latest/meta-data/ami-manifest-path/
(unknown)
/latest/meta-data/block-device-mapping/ami/
/dev/xvda
/latest/meta-data/block-device-mapping/root/
/dev/xvda
/latest/meta-data/hostname/
ip-...snip....ap-southeast-1.compute.internal
/latest/meta-data/iam/info/
{'Code': 'Success',
 'InstanceProfileArn': 'arn:aws:iam::...snip...:instance-profile/admin-server-DEV',
 'InstanceProfileId': '...snip...',
 'LastUpdated': '2018-02-08T01:17:18Z'}
...snip...























  • Your narrative of the approach to making sense of the inconsistent behavior of the metadata service seems valid. I can't comment on your implementation of it. My solution was to assume response lines without a trailing slash were final destinations, and hard-code the relatively few exceptions to that rule. Even though it is valid, you probably don't want to do the infinite loop and 404 detection in production because the metadata service is rate-limited and this technique means you'll be making almost 2x the number of requests you actually need in order to gather all the metadata.
    – Michael - sqlbot
    Feb 8 at 6:07
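
A rough sketch of the trailing-slash rule described in the comment above might look like the following. It is not the commenter's actual code: `walk`, `KNOWN_DIRS` and `FAKE_TREE` are hypothetical names, the exception list only covers the two top-level directories shown in the question, and the fake tree is made up so that the example runs without a metadata service.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple

# A fetch function returns the body on HTTP 200 and None otherwise.
Fetch = Callable[[str], Optional[str]]

# Hard-coded exceptions: directories that are listed without a trailing slash.
# (Assumption: only the entries of the top directory need this treatment.)
KNOWN_DIRS = {'/latest/dynamic', '/latest/meta-data'}


def walk(path: str, fetch: Fetch) -> Iterator[Tuple[str, str]]:
    """Yield (path, value) pairs for every leaf below the listing at `path`."""
    body = fetch(path)
    if body is None:
        return
    for line in body.split('\n'):
        if not line:
            continue
        child = path + line
        if line.endswith('/'):
            yield from walk(child, fetch)        # listed as a directory
        elif child in KNOWN_DIRS:
            yield from walk(child + '/', fetch)  # hard-coded exception
        else:
            value = fetch(child)                 # no slash: final destination
            if value is not None:
                yield child, value


# Made-up tree for illustration only.
FAKE_TREE: Dict[str, str] = {
    '/latest/': 'dynamic\nmeta-data',
    '/latest/dynamic/': 'instance-identity/',
    '/latest/dynamic/instance-identity/': 'document',
    '/latest/dynamic/instance-identity/document': '{"region": "ap-southeast-1"}',
    '/latest/meta-data/': 'ami-id\nblock-device-mapping/',
    '/latest/meta-data/ami-id': 'ami-4f89f533',
    '/latest/meta-data/block-device-mapping/': 'ami\nroot',
    '/latest/meta-data/block-device-mapping/ami': '/dev/xvda',
    '/latest/meta-data/block-device-mapping/root': '/dev/xvda',
}

if __name__ == '__main__':
    for leaf, value in walk('/latest/', FAKE_TREE.get):
        print(leaf, '=>', value)
```

Because a leaf is never probed for children, a multi-line value such as a list of security group IDs is returned as-is and never mistaken for a listing.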











  • You could make use of this library: github.com/adamchainz/ec2-metadata
    – hjpotter92
    Feb 10 at 8:37










  • @Michael-sqlbot "to assume response lines without a trailing slash were final destinations" Doesn't hold for the very top directory. "you'll be making almost 2x the number of requests you actually need" No, my code above does caching in a variable tried
    – nodakai
    Feb 11 at 6:23










  • @hjpotter92 Several fields seem to be missing, at least from the doc
    – nodakai
    Feb 11 at 6:25






  • @nodakai agreed about the top directory, but those are among the exceptions; the general rule holds. Also, I say ~2x requests because you are traversing one level deeper than you need to at each leaf, only to discover a dead end.
    – Michael - sqlbot
    Feb 11 at 13:32
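
To make the request-count argument concrete, here is a small sketch that counts GETs for both strategies on a made-up tree. Everything in it is an assumption for illustration: `TREE`, `CountingFetch`, `probe_walk` and `slash_walk` are hypothetical names, the hostname value is invented, and `probe_walk` is a simplified model of the probing strategy (it only implements the "all children failed" rule and reuses child bodies the way `tried` does).

```python
from typing import Dict, Optional

# Made-up tree: keys are paths, values are response bodies.
TREE: Dict[str, str] = {
    '/latest': 'meta-data/',
    '/latest/meta-data': 'ami-id\nhostname\ninstance-type',
    '/latest/meta-data/ami-id': 'ami-4f89f533',
    '/latest/meta-data/hostname': 'ip-10-0-0-1.internal',
    '/latest/meta-data/instance-type': 't2.nano',
}


class CountingFetch:
    """Dictionary-backed stand-in for an HTTP GET that counts its calls."""

    def __init__(self, tree: Dict[str, str]) -> None:
        self.tree = tree
        self.calls = 0

    def __call__(self, path: str) -> Optional[str]:
        self.calls += 1
        return self.tree.get(path)


def probe_walk(path: str, body: str, fetch: CountingFetch) -> None:
    """Probing strategy: request every line of `body` as a child path."""
    names = [l.rstrip('/') for l in body.split('\n') if l]
    children = [(n, fetch(path + '/' + n)) for n in names]
    if all(t is None for _, t in children):
        return  # every probe failed, so `body` was a value
    for n, t in children:
        if t is not None:
            probe_walk(path + '/' + n, t, fetch)


def slash_walk(path: str, body: str, fetch: CountingFetch) -> None:
    """Trailing-slash strategy: only lines ending in '/' are listings."""
    for line in body.split('\n'):
        if not line:
            continue
        child_path = path + '/' + line.rstrip('/')
        child = fetch(child_path)
        if child is not None and line.endswith('/'):
            slash_walk(child_path, child, fetch)


probe = CountingFetch(TREE)
probe_walk('/latest', probe('/latest') or '', probe)

slash = CountingFetch(TREE)
slash_walk('/latest', slash('/latest') or '', slash)

print(probe.calls, slash.calls)  # prints: 8 5
```

On this tree the probing strategy spends one extra request per leaf (three dead-end probes), so the ratio approaches 2x as leaves come to dominate the tree.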
















edited Feb 13 at 9:45
























asked Feb 8 at 2:09









nodakai







