Data Warehousing for Google Analytics Data


How it works

Needl uses the Google Analytics API to extract and enhance your data, then stores it in BigQuery, Google's cloud data warehouse. An initial run goes back in time to build up your history; after that, Needl makes daily updates.

The Google Analytics API limits the number of dimensions that can be requested in a single query. Needl overcomes this by sending multiple queries and stitching the results together.
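To make the stitching idea concrete, here is a minimal sketch in Python. It is not Needl's actual implementation: the function names, the choice of key dimensions and the `max_dims` value are all assumptions for illustration. The idea is that every query repeats a small set of key dimensions that identify a session, so rows from different queries can be merged on those keys.

```python
def plan_queries(dimensions, key_dims, max_dims):
    """Split dimensions into batches of at most max_dims each.

    Every batch starts with the key dimensions so the results can be
    joined later. max_dims stands in for the API's per-query limit
    (an assumed parameter, not a documented value).
    """
    room = max_dims - len(key_dims)
    if room < 1:
        raise ValueError("max_dims must leave room for a non-key dimension")
    extras = [d for d in dimensions if d not in key_dims]
    return [list(key_dims) + extras[i:i + room]
            for i in range(0, len(extras), room)]


def stitch(result_sets, key_dims):
    """Merge rows from several result sets that share the same key values.

    Assumes the key dimensions uniquely identify one session.
    """
    merged = {}
    for rows in result_sets:
        for row in rows:
            key = tuple(row[k] for k in key_dims)
            merged.setdefault(key, {}).update(row)
    return list(merged.values())


# Hypothetical example: two key dimensions, a limit of 4 per query.
keys = ["clientId", "sessionStart"]
batches = plan_queries(keys + ["city", "browser", "source", "medium"], keys, 4)
# Each batch would be sent as its own API query; here we fake the results.
results = [
    [{"clientId": "c1", "sessionStart": "t1", "city": "Leeds", "browser": "Chrome"}],
    [{"clientId": "c1", "sessionStart": "t1", "source": "google", "medium": "organic"}],
]
sessions = stitch(results, keys)
```

The trade-off is that the key dimensions use up part of every query's allowance, so the more dimensions needed to identify a session, the more queries are required.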

BigQuery connects well with analytics tools such as Data Studio and Google’s ML platform. Give your data a proper home!


Features:

  • Session-level data rather than aggregates, updated daily.
  • 25 dimensions as standard - see typical schema below.
  • Sampling can be avoided - run reports across date ranges and, as long as there are fewer than 500K sessions per day, no sampling is involved (the limit may be lifted in future versions).
  • Bucketed dimensions to aid analytics:
    • device = deviceCategory + operatingSystem + browser
    • daypart = combined dayOfWeek and hour
  • "Has visited" dimensions - whether the user visited a URL or zone (fuzzy-matched page path) during a session.
  • Add in your own data - if you can key your data by, say, location and time, you should be able to join it with the extracted data.
  • Support for client id if harvested and fed back into GA via a custom dimension (how to)
  • Access to further dimensions if client id is used - client id allows us to pull more data from GA, including custom dimensions.
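
As an illustration of the bucketed and "has visited" dimensions listed above, the following Python sketch shows one way such values could be derived. The separators, the output formats and the use of a regular expression for zone matching are assumptions made for this example; the text above does not specify how Needl builds them.

```python
import re


def device_bucket(device_category, operating_system, browser):
    """Combine deviceCategory, operatingSystem and browser into one value.

    The "/" separator is an assumption for illustration.
    """
    return "/".join([device_category, operating_system, browser])


def daypart(day_of_week_name, hour):
    """Combine day of week and hour, e.g. "Monday 09" (assumed format)."""
    return f"{day_of_week_name} {hour:02d}"


def has_visited(page_paths, zone_pattern):
    """Return True if any page path in the session matches the zone.

    A zone is modelled here as a regular expression over the page path,
    which is one possible reading of "fuzzy matched".
    """
    zone = re.compile(zone_pattern)
    return any(zone.search(path) for path in page_paths)


# Hypothetical session: three page views on a mobile device.
session_paths = ["/", "/blog/post-1", "/contact"]
bucket = device_bucket("mobile", "iOS", "Safari")
part = daypart("Monday", 9)
visited_blog = has_visited(session_paths, r"^/blog/")
```

Precomputing values like these at load time keeps later queries simple: a single column can be grouped or filtered on instead of recombining several fields in every report.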


See it in action

Example dataset in BigQuery:

https://console.cloud.google.com/bigquery?project=needlbq&p=needlbq&d=public&t=demo&page=table


Typical Schema (from Needl version 4, currently in beta)

Fields not available in Google Analytics are shown with a description.

Dimension/Metric                Description
time.utcDateTime                UTC date/time
time.dateTime                   Server (Analytics) date/time
time.hour
time.dayOfWeekName
time.workingDay
time.dayPart                    Combined day of week/hour
time.userDayOfWeekName          User’s local day of week
time.userDayPart                User’s local daypart
time.userHour                   User’s local hour
time.dayLight                   User’s local daylight: light/dark/twilight
geoNetwork.cityId
geoNetwork.city
geoNetwork.region
geoNetwork.metro
geoNetwork.metroRegion          Combined metro/region
geoNetwork.country
geoNetwork.longitude
geoNetwork.latitude
geoNetwork.timezone             Timezone of the user
geoNetwork.networkLocation
geoNetwork.continent
geoNetwork.subContinent
user.clientId
user.userType
user.sessionCount
user.sessionFrequency
user.sinceSessionBucket
user.weeksSinceLastSession
user.sessionDurationBucket
user.daysSinceLastSession
trafficSource.source
trafficSource.medium
trafficSource.trafficType
trafficSource.keyword
trafficSource.campaign
trafficSource.channelGrouping
trafficSource.fullReferrer
hits.time
hits.utcTimeStamp
hits.page
hits.page.hits
hits.page.pagePath
hits.page.pagePathBase          Page path with query string removed
hits.page.pagePathLevel1
hits.page.pagePathLevel2
hits.page.pagePathLevel3
hits.page.pagePathLevel4
hits.page.previousPagePath
weather.utcStationTime          Weather event UTC timestamp of station
weather.weatherCategory         Main weather feature (Rain/Clouds/Clear…)
weather.weatherDescription      More detail of weather
weather.sunrise                 Sunrise (weather station)
weather.sunset                  Sunset (weather station)
weather.weatherIcon             Icon: can be used with a URL for the weather image
weather.temperature             Temperature at weather station
device.osMajorVersion
device.browserMajorVersion
device.technology               Combined os/device/browser
device.deviceCategory
device.browser
device.operatingSystemVersion
device.browserVersion
device.operatingSystem
device.language
adWords.adGroup
totals.sessions
totals.bounces
totals.sessionDuration
totals.transactionRevenue
totals.transactions
totals.goal1Completions
totals.goal2Completions
totals.goal3Completions
:
totals.goal20Completions
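
Once the data is in BigQuery it can be queried with standard SQL. The Python sketch below builds a query that totals sessions per traffic medium and, optionally, runs it with the `google-cloud-bigquery` client library. The table name `needlbq.public.demo` is read from the example dataset URL above. That `trafficSource` and `totals` are stored as record columns with `medium` and `sessions` sub-fields is an assumption based on the dotted field names, and `my-billing-project` is a placeholder for your own Google Cloud project.

```python
def build_sessions_by_medium_query(table="needlbq.public.demo", limit=10):
    """Return a standard-SQL query that totals sessions per traffic medium.

    Assumes trafficSource and totals are record (STRUCT) columns.
    """
    return (
        "SELECT trafficSource.medium AS medium, "
        "SUM(totals.sessions) AS sessions "
        f"FROM `{table}` "
        "GROUP BY medium "
        "ORDER BY sessions DESC "
        f"LIMIT {int(limit)}"
    )


def run_query(sql, project):
    """Run the query and return the rows as a list of dicts.

    Needs the google-cloud-bigquery package and valid credentials;
    the import is deferred so the query builder works without them.
    """
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    return [dict(row.items()) for row in client.query(sql).result()]


sql = build_sessions_by_medium_query(limit=5)
# rows = run_query(sql, project="my-billing-project")  # placeholder project
```

If the actual table layout differs (for example, if `hits` is a repeated record), the field references in the query would need adjusting to match.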

Book a demo for free!