File:Gaussianprocess SnowboardTrend.svg

Summary

Description
English: Application of Gaussian process regression to the Google Trends statistics for the search term "Snowboard".
Date
Source Own work
Author Physikinger
SVG development Valid SVG created with Matplotlib
Source code

Python code

#This source code is public domain
#Author: Christian Schirm 
import numpy, scipy.spatial
import matplotlib.pyplot as plt

# Data source: https://www.google.de/trends/explore?date=all&q=Snowboard
x = numpy.array([ 2004.08,  2004.17,  2004.25,  2004.33,  2004.42,  2004.50,  2004.58,
        2004.67,  2004.75,  2004.83,  2004.92,  2005.00,  2005.08,  2005.17,  2005.25,
        2005.33,  2005.42,  2005.50,  2005.58,  2005.67,  2005.75,  2005.83,  2005.92,
        2006.00,  2006.08,  2006.17,  2006.25,  2006.33,  2006.42,  2006.50,  2006.58,
        2006.67,  2006.75,  2006.83,  2006.92,  2007.00,  2007.08,  2007.17,  2007.25,
        2007.33,  2007.42,  2007.50,  2007.58,  2007.67,  2007.75,  2007.83,  2007.92,
        2008.00,  2008.08,  2008.17,  2008.25,  2008.33,  2008.42,  2008.50,  2008.58,
        2008.67,  2008.75,  2008.83,  2008.92,  2009.00,  2009.08,  2009.17,  2009.25,
        2009.33,  2009.42,  2009.50,  2009.58,  2009.67,  2009.75,  2009.83,  2009.92,
        2010.00,  2010.08,  2010.17,  2010.25,  2010.33,  2010.42,  2010.50,  2010.58,
        2010.67,  2010.75,  2010.83,  2010.92,  2011.00,  2011.08,  2011.17,  2011.25,
        2011.33,  2011.42,  2011.50,  2011.58,  2011.67,  2011.75,  2011.83,  2011.92,
        2012.00,  2012.08,  2012.17,  2012.25,  2012.33,  2012.42,  2012.50,  2012.58,
        2012.67,  2012.75,  2012.83,  2012.92,  2013.00,  2013.08,  2013.17,  2013.25,
        2013.33,  2013.42,  2013.50,  2013.58,  2013.67,  2013.75,  2013.83,  2013.92,
        2014.00,  2014.08,  2014.17,  2014.25,  2014.33,  2014.42,  2014.50,  2014.58,
        2014.67,  2014.75,  2014.83,  2014.92,  2015.00,  2015.08,  2015.17,  2015.25,
        2015.33,  2015.42,  2015.50,  2015.58,  2015.67,  2015.75,  2015.83,  2015.92,
        2016.00,  2016.08,  2016.17,  2016.25,  2016.33,  2016.42,  2016.50,  2016.58])
y = numpy.array([ 100.,   75.,   44.,   24.,   18.,   17.,   19.,   26.,   37.,
         57.,   77.,   95.,   84.,   70.,   43.,   21.,   16.,   15.,
         18.,   24.,   33.,   50.,   70.,   94.,   78.,   80.,   43.,
         21.,   14.,   13.,   15.,   22.,   31.,   46.,   61.,   72.,
         60.,   49.,   28.,   15.,   11.,   11.,   13.,   17.,   23.,
         33.,   50.,   68.,   58.,   44.,   27.,   14.,   10.,   10.,
         12.,   16.,   22.,   31.,   46.,   66.,   61.,   44.,   26.,
         13.,   10.,   11.,   12.,   16.,   21.,   31.,   39.,   56.,
         56.,   65.,   28.,   13.,   10.,    9.,   10.,   13.,   17.,
         24.,   37.,   57.,   44.,   30.,   19.,   10.,    7.,    8.,
          9.,   11.,   14.,   20.,   29.,   37.,   36.,   30.,   15.,
         10.,   10.,    8.,    8.,    9.,   12.,   16.,   23.,   34.,
         34.,   26.,   15.,    7.,    5.,    5.,    6.,    7.,   10.,
         14.,   22.,   31.,   28.,   42.,   14.,    6.,    5.,    4.,
          5.,    7.,    8.,   11.,   18.,   25.,   27.,   21.,   11.,
          5.,    4.,    4.,    5.,    6.,    7.,   10.,   16.,   21.,
         27.,   18.,   10.,    6.,    4.,    4.,    4.])

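# Fit in log space so that the back-transformed forecast stays positive
x_known = x
y_known = numpy.log(y)
# Monthly prediction grid starting mid-2016 (times in fractional years)
x_unknown = numpy.arange(2016.5,2023,1/12.)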
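# Covariance as a function of the time difference d (in years), a sum of:
#   a 1-year periodic term that decays slowly with d (scale 25 years),
#   a 4-year periodic term,
#   a long-range exponential term (scale 45 years).
def covFunc(d):
    return 0.8*numpy.exp(-numpy.abs(numpy.sin(numpy.pi*d))/0.5  -numpy.abs(d/25.)**2 - 2.5) + \
        (0.2-0.01)*numpy.exp(-(numpy.abs(numpy.sin(numpy.pi*d/4))/0.2)) + 0.01*numpy.exp(-numpy.abs(d/45.))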

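# Covariance matrix between the points x1 and x2; an optional noise
# variance is added to the diagonal (only meaningful when x1 and x2 coincide)
def covMat(x1, x2, covFunc, noise=0):
    cov = covFunc(scipy.spatial.distance_matrix(numpy.atleast_2d(x1).T, numpy.atleast_2d(x2).T))
    if noise:
        numpy.fill_diagonal(cov, numpy.diag(cov) + noise)
    return cov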

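# Covariance blocks: known-known (with noise), unknown-unknown, unknown-known
Ckk = covMat(x_known, x_known, covFunc, noise=0.02)
Cuu = covMat(x_unknown, x_unknown, covFunc, noise=0.00)
CkkInv = numpy.linalg.inv(Ckk)
Cuk = covMat(x_unknown, x_known, covFunc, noise=0)
# Posterior mean, using the sample mean of the log data as a constant prior mean
m = numpy.mean(y_known)
y_unknown = m + numpy.dot(numpy.dot(Cuk,CkkInv), y_known - m)
# Posterior standard deviation, scaled by the root mean square of the log data
sigmaPrior = numpy.sqrt(numpy.mean(numpy.square(y_known)))
sigma = sigmaPrior*numpy.sqrt(numpy.diag(Cuu - numpy.dot(numpy.dot(Cuk,CkkInv),Cuk.T)))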

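fig = plt.figure(figsize=(6,3), dpi=100)
# Observed data (blue), forecast (red) and +/- one sigma band (grey),
# transformed back from log space
plt.plot(x,y,'-')
plt.plot(x_unknown,numpy.exp(y_unknown),'r-')
plt.fill_between(x_unknown, numpy.exp(y_unknown - sigma), numpy.exp(y_unknown + sigma), color = '0.85')
plt.xlim(2004,2022.5)
plt.xticks(numpy.arange(2004,2023,2))
plt.ylim(0,100)
# Dashed vertical line marks the start of the forecast
plt.vlines([2016.5], 0, 100,'0.6','--')
# German labels as in the published image.
# Title: 'Google Trends for the search term "Snowboard"'
# Y label: 'Search queries per month (%)'
plt.title('Google-Trend zum Suchbegriff "Snowboard"')
plt.ylabel('Suchanfragen pro Monat (%)')
plt.savefig('Gaussianprocess_SnowboardTrend.svg')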
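
The regression step of the script above can also be written without forming an explicit matrix inverse. The following is only a minimal sketch under stated assumptions, not the code that produced the image: the names `gp_posterior` and `sq_exp` and the toy data are hypothetical, inputs are assumed to be one-dimensional, and the log transform and the `sigmaPrior` scaling of the original script are left out.

```python
import numpy

def gp_posterior(x_known, y_known, x_unknown, cov_func, noise=0.02):
    """Posterior mean and standard deviation of a Gaussian process
    with a constant prior mean (hypothetical helper, 1-D inputs only)."""
    # Pairwise absolute time differences via broadcasting
    d_kk = numpy.abs(x_known[:, None] - x_known[None, :])
    d_uk = numpy.abs(x_unknown[:, None] - x_known[None, :])
    C_kk = cov_func(d_kk) + noise * numpy.eye(len(x_known))
    C_uk = cov_func(d_uk)
    # Constant prior mean taken from the data, as in the script above
    m = numpy.mean(y_known)
    # Solve linear systems instead of inverting C_kk explicitly
    mean = m + C_uk @ numpy.linalg.solve(C_kk, y_known - m)
    # Diagonal of C_uu - C_uk C_kk^-1 C_uk^T without building the full matrix
    reduction = numpy.sum(C_uk * numpy.linalg.solve(C_kk, C_uk.T).T, axis=1)
    var = cov_func(numpy.zeros(len(x_unknown))) - reduction
    # Clip tiny negative values caused by rounding before taking the root
    return mean, numpy.sqrt(numpy.maximum(var, 0.0))

# Toy covariance function (assumption): squared exponential, length scale 1
def sq_exp(d):
    return numpy.exp(-0.5 * d**2)

# Made-up toy data, not taken from Google Trends
x_toy = numpy.array([0.0, 1.0, 2.0, 3.0])
y_toy = numpy.array([1.0, 2.0, 0.5, 1.5])
x_new = numpy.array([1.0, 50.0])
mean, sd = gp_posterior(x_toy, y_toy, x_new, sq_exp, noise=1e-8)
```

With almost no noise, the prediction at a known point reproduces the observed value with near-zero uncertainty, while far away from the data it falls back to the prior mean and the prior standard deviation.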

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
This file is made available under the Creative Commons CC0 1.0 Universal Public Domain Dedication.
The person who associated a work with this deed has dedicated the work to the public domain by waiving all of their rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

Category:CC-Zero Category:Gaussian processes Category:Google Trends Category:Self-published work Category:Valid SVG created with Matplotlib code