HYDROLIX BLOG

Ponderings, insights and industry updates

Index and enrich Akamai logs in realtime

June 7, 2022

Author: David Sztykman |

Tags: , , ,

Akamai new feature Datastream v2 allows streaming in real time of logs into HTTPS endpoint.

In this blog post we’ll see how to setup Akamai and Hydrolix to receive, parse and index logs in realtime.

Setup Akamai

To setup Akamai Datastream you can follow the documentation:

Create a new DataStream:

  • Destination select Custom HTTPS
  • Name enter Hydrolix
  • Endpoint URL https://hydrolixhostname/ingest/event
  • Authentication select none
  • Go to Custom Header and add the following:
    X-HDX-Table: project.table
    X-HDX-Transform: akamai_transform
  • Check the box Send compressed data
  • In the Datasets page, pick JSON logs for the log format:

The summary should look like that:

Once datastream configuration is deployed you can update your regular delivery configuration to deliver logs using datastream v2.

In property manager select the behavior DataStream2 and specify the the Stream Name configured earlier and the sampling rate to apply:

Indexing

Akamai sends the following data by default:

{
  "version": 1,
  "customField": "any-custom-value",
  "reqId": "1239f220",
  "reqTimeSec": "1573840000",
  "city": "HERNDON",
  "state": "Virginia",
  "country": "US",
  "cacheStatus": "1",
  "breadcrumbs": "//BC/%5Ba=6.7.8.9,c=g,k=0,l=1%5D",
  "dnsLookupTimeMSec": "50",
  "transferTimeMSec": "125",
  "turnAroundTimeMSec": "11",
  "errorCode": "ERR_ACCESS_DENIED|fwd_acl",
  "reqEndTimeMSec": "3",
  "xForwardedFor": "1.2.3.4",
  "maxAgeSec": "3600",
  "referer": "https%3A%2F%2Ftest.referrer.net%2Fen-US%2Fdocs%2FWeb%2Ftest",
  "range": "37334-42356",
  "cookie": "cookie-content",
  "accLang": "en-US",
  "reqPort": "443",
  "reqPath": "/path1/path2/file.ext",
  "reqMethod": "GET",
  "reqHost": "test.hostname.net",
  "proto": "HTTPS",
  "statusCode": "206",
  "cliIP": "4.5.6.7",
  "cp": "123456",
  "bytes": "4995",
  "rspContentLen": "5000",
  "rspContentType": "text/html",
  "UA": "Mozilla%2F5.0+%28Macintosh%3B+Intel+Mac+OS+X+10_14_3%29",
  "tlsOverheadTimeMSec": "0",
  "tlsVersion": "TLSv1",
  "objSize": "484",
  "uncompressedSize": "484",
  "overheadBytes": "232",
  "totalBytes": "0",
  "queryStr": "param=value"
}

Some of those fields are easily parsed and indexed, but some requires more advanced treatment, in particular the breadcrumbs.

In this example the breadcrumbs data is urlencoded, we need to decode it and then parse the information it contains.
The value we are working with are://BC/[a=6.7.8.9,c=g,k=0,l=1]

By reading the documentation on breadcrumbs we have multiple information we can extract:

  • a =6.7.8.9 component IP
  • c=g This is at the edge ghost
  • k=0 request end time in ms
  • l=1 turn around time in ms

There are several more information which can be included into the breadcrumbs.

Being able to extract those information at indexing is very valuable, as it’s a one time process and the expensive extraction is done only once.

At Hydrolix we have added a new feature allowing users to write SQL statement at indexing time.

We can write a SQL statement to extract the Edge_IP:

SELECT extract(extract(decodeURLComponent(breadcrumbs), '(\[[^[]*c=g[^]]*\])'), 'a=([^,\]]+)') as Edge_IP

Let’s decompose this statement:

  • decodeURLComponent(breadcrumbs) is to extract the breadcrumbs in decoded URL format.
  • extract(decoded_breadcrumbs, '(\[[^[]*c=g[^]]*\])') is a regex to extract all the data between [ and ] which contains c=g (c=g means edge).
  • And finally extract(from_previous_extract, 'a=([^,\]]+)') is the final regex to extract the value after a=.

To summarise in this select we are extracting the value after a= where [ c=g ] and put the extracted information into a new column called Edge_IP.

In this case we are creating a new column based on the extracted value of the incoming data.

A few other column needs some transformation, the UA is urlencoded and the referer too.
We can override those in the SQL statement:

SELECT decodeURLComponent(UA) as UA, decodeURLComponent(referer) as referer

This example is overriding the current value and replacing it with the urldecoded value.

One critical aspect is the latency of ingestion, we can calculate it dynamically using:

SELECT now() - reqTimeSec as latency

This will calculate the difference in seconds between when the user requested the content and the time we receive the data and create a partition.

The full SQL statement we are using including all the breadcrumbs is the following:

SELECT now() - reqTimeSec as latency,
decodeURLComponent(assumeNotNull(UA)) as UA,
decodeURLComponent(assumeNotNull(state)) as state,
extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '\/\/BC\/(\S*)') as breadcrumbs,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=g[^]]*\])'), 'a=([^,\]]+)') as Edge_IP,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=g[^]]*\])'), 'b=([^,\]]+)') as Edge_RequestID,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=g[^]]*\])'), 'k=([^,\]]+)') as Edge_RequestEndTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=g[^]]*\])'), 'l=([^,\]]+)') as Edge_TurnAroundTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=g[^]]*\])'), 'm=([^,\]]+)') as Edge_DNSLookupTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=g[^]]*\])'), 'n=([^,\]]+)') as Edge_GeoInfo,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=g[^]]*\])'), 'o=([^,\]]+)') as Edge_ASN,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=o[^]]*\])'), 'a=([^,\]]+)') as Origin_IP,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=o[^]]*\])'), 'b=([^,\]]+)') as Origin_RequestID,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=o[^]]*\])'), 'k=([^,\]]+)') as Origin_RequestEndTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=o[^]]*\])'), 'l=([^,\]]+)') as Origin_TurnAroundTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=o[^]]*\])'), 'm=([^,\]]+)') as Origin_DNSLookupTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=o[^]]*\])'), 'n=([^,\]]+)') as Origin_GeoInfo,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\[[^[]*c=o[^]]*\])'), 'o=([^,\]]+)') as Origin_ASN,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=p[^]]*\])'), 'a=([^,\]]+)') as Peer_IP,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=p[^]]*\])'), 'b=([^,\]]+)') as Peer_RequestID,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=p[^]]*\])'), 'k=([^,\]]+)') as Peer_RequestEndTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=p[^]]*\])'), 'l=([^,\]]+)') as Peer_TurnAroundTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=p[^]]*\])'), 'm=([^,\]]+)') as Peer_DNSLookupTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=p[^]]*\])'), 'n=([^,\]]+)') as Peer_GeoInfo,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=p[^]]*\])'), 'o=([^,\]]+)') as Peer_ASN,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=c[^]]*\])'), 'a=([^,\]]+)') as Parent_IP,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=c[^]]*\])'), 'b=([^,\]]+)') as Parent_RequestID,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=c[^]]*\])'), 'k=([^,\]]+)') as Parent_RequestEndTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=c[^]]*\])'), 'l=([^,\]]+)') as Parent_TurnAroundTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=c[^]]*\])'), 'm=([^,\]]+)') as Parent_DNSLookupTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=c[^]]*\])'), 'n=([^,\]]+)') as Parent_GeoInfo,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=c[^]]*\])'), 'o=([^,\]]+)') as Parent_ASN,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=w[^]]*\])'), 'a=([^,\]]+)') as CloudWrapper_IP,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=w[^]]*\])'), 'b=([^,\]]+)') as CloudWrapper_RequestID,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=w[^]]*\])'), 'k=([^,\]]+)') as CloudWrapper_RequestEndTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=w[^]]*\])'), 'l=([^,\]]+)') as CloudWrapper_TurnAroundTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=w[^]]*\])'), 'm=([^,\]]+)') as CloudWrapper_DNSLookupTime,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=w[^]]*\])'), 'n=([^,\]]+)') as CloudWrapper_GeoInfo,
extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\[[^[]*c=w[^]]*\])'), 'o=([^,\]]+)') as CloudWrapper_ASN,
* FROM {STREAM}

We end up with the following information indexed into Hydrolix:

{
  "CloudWrapper_ASN": null,
  "CloudWrapper_DNSLookupTime": null,
  "CloudWrapper_GeoInfo": "",
  "CloudWrapper_IP": "",
  "CloudWrapper_RequestEndTime": null,
  "CloudWrapper_RequestID": "",
  "CloudWrapper_TurnAroundTime": null,
  "Edge_ASN": null,
  "Edge_DNSLookupTime": "",
  "Edge_GeoInfo": "",
  "Edge_IP": "6.7.8.9",
  "Edge_RequestEndTime": 0,
  "Edge_RequestID": "",
  "Edge_TurnAroundTime": 1,
  "Origin_ASN": null,
  "Origin_DNSLookupTime": "",
  "Origin_GeoInfo": "",
  "Origin_IP": "",
  "Origin_RequestEndTime": "",
  "Origin_RequestID": "",
  "Origin_TurnAroundTime": "",
  "Parent_ASN": null,
  "Parent_DNSLookupTime": null,
  "Parent_GeoInfo": "",
  "Parent_IP": "",
  "Parent_RequestEndTime": null,
  "Parent_RequestID": "",
  "Parent_TurnAroundTime": null,
  "Peer_ASN": null,
  "Peer_DNSLookupTime": null,
  "Peer_GeoInfo": "",
  "Peer_IP": "",
  "Peer_RequestEndTime": null,
  "Peer_RequestID": "",
  "Peer_TurnAroundTime": null,
  "latency": 96,
  "version": 1,
  "customField": "any-custom-value",
  "reqId": "1239f220",
  "reqTimeSec": "2019-11-15 17:46:40",
  "city": "HERNDON",
  "state": "Virginia",
  "country": "US",
  "cacheStatus": "1",
  "breadcrumbs": "[a=6.7.8.9,c=g,k=0,l=1]",
  "dnsLookupTimeMSec": "50",
  "transferTimeMSec": "125",
  "turnAroundTimeMSec": "11",
  "errorCode": "ERR_ACCESS_DENIED|fwd_acl",
  "reqEndTimeMSec": "3",
  "xForwardedFor": "1.2.3.4",
  "maxAgeSec": "3600",
  "referer": "https://test.referrer.net/en-US/docs/Web/test",
  "range": "37334-42356",
  "cookie": "cookie-content",
  "accLang": "en-US",
  "reqPort": "443",
  "reqPath": "/path1/path2/file.ext",
  "reqMethod": "GET",
  "reqHost": "test.hostname.net",
  "proto": "HTTPS",
  "statusCode": "206",
  "cliIP": "4.5.6.7",
  "cp": "123456",
  "bytes": "4995",
  "rspContentLen": "5000",
  "rspContentType": "text/html",
  "UA": "Mozilla/5.0+(Macintosh;+Intel+Mac+OS+X+10_14_3)",
  "tlsOverheadTimeMSec": "0",
  "tlsVersion": "TLSv1",
  "objSize": "484",
  "uncompressedSize": "484",
  "overheadBytes": "232",
  "totalBytes": "0",
  "queryStr": "param=value"
}

Full example

To use this example you can refer to the VSCode blog example:

Akamai Transform example

### Global variable to replace with your own needs
@host = host
@projectname = sample
@tablename = akamai_logs
@transformname = akamai_sql_transform
@username = "user@company.com"
@password = "xxxxxxx"

@base_url = https://{{host}}.hydrolix.live/config/v1/
@post_endpoint = https://{{host}}.hydrolix.live/ingest/event
@sql_endpoint = https://{{host}}.hydrolix.live/query/

### Login authentication get access token  and UUID Org variable
# @name login
POST {{base_url}}login
Content-Type: application/json

{
    "username": {{username}},
    "password": {{password}}
}
### Store, parse the login response body to store the access token and organization id
@access_token = {{login.response.body.auth_token.access_token}}
@org_id = {{login.response.body.orgs[0].uuid}}


### Create a new project using the variable {{projectname}}
# @name new_project
POST  {{base_url}}orgs/{{org_id}}/projects/
Authorization: Bearer {{access_token}}
Content-Type: application/json

{
    "name": "{{projectname}}",
    "org": "{{org_id}}"
}
### Store, parse project ID from response
@projectid = {{new_project.response.body.uuid}}

### Create a new table named {{tablename}} in the {{projectname}}
# @name new_table
POST  {{base_url}}orgs/{{org_id}}/projects/{{projectid}}/tables/
Authorization: Bearer {{access_token}}
Content-Type: application/json

{
    "name": "{{tablename}}",
    "project": "{{projectid}}",
    "description": "Sample Table Akamai Logs",
    "settings": {
        "merge": {
            "enabled": true
        },
        "autoingest": {
            "enabled": false
        }
    }
}
### Store, parse table ID from response
@tableid = {{new_table.response.body.uuid}}

#### Creates a a transform for the json format and upload to our table
# @name new_transform
POST {{base_url}}orgs/{{org_id}}/projects/{{projectid}}/tables/{{tableid}}/transforms/
Authorization: Bearer {{access_token}}
Content-Type: application/json

{
    "name": "{{transformname}}",
    "description": "Demo Akamai Transform",
    "type": "json",
    "settings": {
       "is_default": true,
       "sql_transform": "SELECT now() - reqTimeSec as latency, decodeURLComponent(assumeNotNull(UA)) as UA, decodeURLComponent(assumeNotNull(referer)) as referer, extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '\\\/\\\/BC\\\/(\\S*)') as breadcrumbs, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=g[^]]*\\])'), 'a=([^,\\]]+)') as Edge_IP, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=g[^]]*\\])'), 'b=([^,\\]]+)') as Edge_RequestID, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=g[^]]*\\])'), 'k=([^,\\]]+)') as Edge_RequestEndTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=g[^]]*\\])'), 'l=([^,\\]]+)') as Edge_TurnAroundTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=g[^]]*\\])'), 'm=([^,\\]]+)') as Edge_DNSLookupTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=g[^]]*\\])'), 'n=([^,\\]]+)') as Edge_GeoInfo, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=g[^]]*\\])'), 'o=([^,\\]]+)') as Edge_ASN, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=o[^]]*\\])'), 'a=([^,\\]]+)') as Origin_IP, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=o[^]]*\\])'), 'b=([^,\\]]+)') as Origin_RequestID, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=o[^]]*\\])'), 'k=([^,\\]]+)') as Origin_RequestEndTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=o[^]]*\\])'), 'l=([^,\\]]+)') as Origin_TurnAroundTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=o[^]]*\\])'), 'm=([^,\\]]+)') as Origin_DNSLookupTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=o[^]]*\\])'), 'n=([^,\\]]+)') as Origin_GeoInfo, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(\\[[^[]*c=o[^]]*\\])'), 'o=([^,\\]]+)') as Origin_ASN, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=p[^]]*\\])'), 'a=([^,\\]]+)') as Peer_IP, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=p[^]]*\\])'), 'b=([^,\\]]+)') as Peer_RequestID, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=p[^]]*\\])'), 'k=([^,\\]]+)') as Peer_RequestEndTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=p[^]]*\\])'), 'l=([^,\\]]+)') as Peer_TurnAroundTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=p[^]]*\\])'), 'm=([^,\\]]+)') as Peer_DNSLookupTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=p[^]]*\\])'), 'n=([^,\\]]+)') as Peer_GeoInfo, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=p[^]]*\\])'), 'o=([^,\\]]+)') as Peer_ASN, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=c[^]]*\\])'), 'a=([^,\\]]+)') as Parent_IP, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=c[^]]*\\])'), 'b=([^,\\]]+)') as Parent_RequestID, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=c[^]]*\\])'), 'k=([^,\\]]+)') as Parent_RequestEndTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=c[^]]*\\])'), 'l=([^,\\]]+)') as Parent_TurnAroundTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=c[^]]*\\])'), 'm=([^,\\]]+)') as Parent_DNSLookupTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=c[^]]*\\])'), 'n=([^,\\]]+)') as Parent_GeoInfo, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=c[^]]*\\])'), 'o=([^,\\]]+)') as Parent_ASN, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=w[^]]*\\])'), 'a=([^,\\]]+)') as CloudWrapper_IP, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=w[^]]*\\])'), 'b=([^,\\]]+)') as CloudWrapper_RequestID, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=w[^]]*\\])'), 'k=([^,\\]]+)') as CloudWrapper_RequestEndTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=w[^]]*\\])'), 'l=([^,\\]]+)') as CloudWrapper_TurnAroundTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=w[^]]*\\])'), 'm=([^,\\]]+)') as CloudWrapper_DNSLookupTime, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=w[^]]*\\])'), 'n=([^,\\]]+)') as CloudWrapper_GeoInfo, extract(extract(decodeURLComponent(assumeNotNull(breadcrumbs)), '(,\\[[^[]*c=w[^]]*\\])'), 'o=([^,\\]]+)') as CloudWrapper_ASN, * FROM {STREAM}",
       "output_columns": [
          {
             "name": "version",
             "datatype": {
                "type": "uint8"
             }
          },
          {
             "name": "customField",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "reqId",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "reqTimeSec",
             "datatype": {
                "type": "epoch",
                "format": "s",
                "primary": true,
                "resolution": "s"
             }
          },
          {
             "name": "city",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "state",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "country",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "cacheStatus",
             "datatype": {
                "type": "boolean"
             }
          },
          {
             "name": "breadcrumbs",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "dnsLookupTimeMSec",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "transferTimeMSec",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "turnAroundTimeMSec",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "errorCode",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "reqEndTimeMSec",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "xForwardedFor",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "maxAgeSec",
             "datatype": {
                "type": "uint64"
             }
          },
          {
             "name": "referer",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "range",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "cookie",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "accLang",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "reqPort",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "reqPath",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "reqMethod",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "reqHost",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "proto",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "statusCode",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "cliIP",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "cp",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "bytes",
             "datatype": {
                "type": "uint64"
             }
          },
          {
             "name": "rspContentLen",
             "datatype": {
                "type": "uint64"
             }
          },
          {
             "name": "rspContentType",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "UA",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "tlsOverheadTimeMSec",
             "datatype": {
                "type": "uint32"
             }
          },
          {
             "name": "tlsVersion",
             "datatype": {
                "type": "string"
             }
          },
          {
             "name": "objSize",
             "datatype": {
                "type": "uint64"
             }
          },
          {
             "name": "uncompressedSize",
             "datatype": {
                "type": "uint64"
             }
          },
          {
             "name": "overheadBytes",
             "datatype": {
                "type": "uint64"
             }
          },
          {
             "name": "totalBytes",
             "datatype": {
                "type": "uint64"
             }
          },
          {
             "name": "queryStr",
             "datatype": {
                "type": "string"
             }
          },
                {
        "name": "Edge_IP",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Edge_RequestID",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Edge_RequestEndTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Edge_TurnAroundTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Edge_DNSLookupTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Edge_GeoInfo",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Edge_ASN",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Origin_IP",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Origin_RequestID",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Origin_RequestEndTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Origin_TurnAroundTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Origin_DNSLookupTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Origin_GeoInfo",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Origin_ASN",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Parent_IP",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Parent_RequestID",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Parent_RequestEndTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Parent_TurnAroundTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Parent_DNSLookupTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Parent_GeoInfo",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Parent_ASN",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "CloudWrapper_IP",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "CloudWrapper_RequestID",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "CloudWrapper_RequestEndTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "CloudWrapper_TurnAroundTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "CloudWrapper_DNSLookupTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "CloudWrapper_GeoInfo",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "CloudWrapper_ASN",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Peer_IP",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Peer_RequestID",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Peer_RequestEndTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Peer_TurnAroundTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Peer_DNSLookupTime",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Peer_GeoInfo",
        "datatype": {
          "type": "string",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "Peer_ASN",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      },
      {
        "name": "latency",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      }
        ],
        "compression": "gzip",
        "format_details": {
            "flattening": {
                "active": false
            }
        }
    }
}

As you can see in the transform we have some new column which are generated by the sql transform:

      {
        "name": "latency",
        "datatype": {
          "type": "uint32",
          "source": {
            "from_input_field": "sql_transform"
          }
        }
      }

The source for latency is from_input_field: sql_transform which is where the data is generated.

We can add many more column generated from the Akamai raw data, we can extract enrich and go even further with custom function and dictionary!
Let’s see it in action with CMCD (Common Media Client Data).

Share Now