  1. Linked Statistical Data Analysis

    SFA Séminaire Linked Data, Bern, 2014-01-29

    #LinkedData #lodch

    Sarven Capadisli http://csarven.ca/#i @csarven

  2. What and Why?

    Fun and profit

  3. Statistical Data on the Web (Characteristics)

    • Heterogeneous
    • Decentralized
    • Structured
    • High volume
    • Formats (e.g., CSV, Excel, PC-Axis, SDMX-ML, XML)

    Clean? Synchronised? Comparable? Provenance? Trustable? Analyses?

  4. A Linked Dataspace

    from Statistical Linked Dataspaces

  5. Statistical Linked Dataspaces (2010-2011)

  6. Galway City page on DataGovIE

    Screenshot of Galway City page on DataGovIE

  7. School Explorer

    School Explorer Pilot screenshot

  8. Statistical Linked Dataspaces (2012)

  9. Central government debt indicator chart view for some countries

    Screenshot of World Bank indicator GC.DOD.TOTL.GD.ZS for countries CA,US,DE,CH,IE in worldbank.270a.info/view

    See also: http://worldbank.270a.info/view?indicator=GC.DOD.TOTL.GD.ZS&country=CA,US,DE,CH,IE

  10. Statistical Linked Dataspaces (2013)

    Source format? SDMX-ML

  11. Statistical Linked Dataspaces (2014+)

  12. 270a Cloud

  13. Interesting queries?

    • Number of people born in Bern before 1900
    • Inflation rate in Italy when the prime minister was ...
    • Development projects in low-middle income countries situated above the equator

  14. How about interesting analysis?

    • Statistically significant relationships between GDP and mortality rate
    • Strong correlations
    • Predicting or forecasting possible outcomes
    • Investigating the WHYs
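    As a rough illustration of the kind of analysis listed above, the sketch below computes a Pearson correlation and an ordinary least-squares line for two paired series. The input values are made-up placeholders, not real GDP or mortality figures, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def least_squares(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Placeholder values only (NOT real statistics): x could stand for an
# income indicator and y for a mortality indicator, one pair per country.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 8.0, 6.0, 4.0, 2.0]

r = pearson_r(x, y)
slope, intercept = least_squares(x, y)
print(r, slope, intercept)
```

    A strong correlation found this way says nothing about the WHYs; it only flags a pair of indicators worth a closer look.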
  15. stats.270a.info

    Citizen-centric interfaces for statistical stuff.

    Intended for data journalists, researchers, non-developers!

    ... and Linked Data friendly.

  16. Analysis user-interface (Plot) 1/3

    http://stats.270a.info/analysis/worldbank:SP.DYN.IMRT.IN/transparency:CPI2009/year:2009
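    The analysis URL above appears to follow a pattern of two "prefix:dataset" identifiers followed by a "year:" segment. The sketch below builds and parses URLs under that reading. The pattern is inferred from the example URL only, and the helper names are hypothetical.

```python
BASE = "http://stats.270a.info/analysis"


def build_analysis_url(dataset_x, dataset_y, year):
    """Join two 'prefix:dataset' identifiers and a reference year into a URL."""
    for ident in (dataset_x, dataset_y):
        if ":" not in ident:
            raise ValueError(f"expected 'prefix:dataset', got {ident!r}")
    return f"{BASE}/{dataset_x}/{dataset_y}/year:{year}"


def parse_analysis_url(url):
    """Split an analysis URL back into its two dataset identifiers and year."""
    if not url.startswith(BASE + "/"):
        raise ValueError("not an analysis URL")
    dataset_x, dataset_y, period = url[len(BASE) + 1:].split("/")
    key, _, year = period.partition(":")
    if key != "year":
        raise ValueError("expected a 'year:' segment")
    return dataset_x, dataset_y, int(year)


url = build_analysis_url("worldbank:SP.DYN.IMRT.IN", "transparency:CPI2009", 2009)
print(url)
print(parse_analysis_url(url))
```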

  17. Analysis user-interface (Summary) 2/3

    http://stats.270a.info/analysis/worldbank:SP.DYN.IMRT.IN/transparency:CPI2009/year:2009

  18. Oh yeah?

    Provenance user-interface

  19. Analysis user-interface (Provenance) 3/3

    http://stats.270a.info/provenance/fa698e46868fe348865678884e89ef84b0be6c64

  20. stats.270a.info Toolkit

    • Shiny server (node)
    • R (Shiny, SPARQL packages)
    • Jena Fuseki
    • Apache
    • Linked Data Pages
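    To give a feel for how a client might talk to a SPARQL server such as Fuseki, the sketch below encodes a query as a SPARQL Protocol GET URL. The endpoint address is a placeholder (the slides do not name one), the helper names are hypothetical, and the query assumes observations are linked to their dataset with the RDF Data Cube property qb:dataSet.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint: the slides do not give the real SPARQL endpoint URL.
ENDPOINT = "http://example.org/sparql"


def observations_query(dataset_uri, limit=10):
    """SPARQL text listing observations of a dataset (assumes qb:dataSet links)."""
    return (
        "PREFIX qb: <http://purl.org/linked-data/cube#>\n"
        "SELECT ?observation\n"
        f"WHERE {{ ?observation qb:dataSet <{dataset_uri}> }}\n"
        f"LIMIT {int(limit)}"
    )


def query_url(endpoint, query):
    """Encode a query as a SPARQL Protocol GET URL (the 'query' parameter)."""
    return f"{endpoint}?{urlencode({'query': query})}"


def decode_query(url):
    """Recover the query text from a request URL, e.g. for logging."""
    return parse_qs(urlsplit(url).query)["query"][0]


query = observations_query("http://worldbank.270a.info/dataset/world-bank-finances")
url = query_url(ENDPOINT, query)
print(url)
```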
  21. So What?

    • Strengthening trust
    • Better data journalism?
    • Discovery of interesting correlations
    • Uncovering insights, making predictions, ... decisions
    • Bulk pre-analysis
    • Production of new statistical artefacts
  22. Consider the following

  23. Identifying things

    "Now! That should clear up a few things around here." (The Far Side)

  24. Linked Statistical Artefacts

    • Dataset: http://worldbank.270a.info/dataset/world-bank-finances
    • Observation: http://ecb.270a.info/dataset/SEE/A/AT/WBR0/EXT/X/E/2011
    • Dimension: http://oecd.270a.info/property/TIME
    • Measure: http://ecb.270a.info/property/OBS_VALUE
    • Attribute: http://transparency.270a.info/classification/attribute/matching-percentiles
    • Concept: http://imf.270a.info/code/1.0/CL_AREA/CH
    • Code list: http://fao.270a.info/code/0.1/CL_UN_COUNTRY
    • Hierarchical code list: http://bfs.270a.info/code/1.0/HR_HGDE_HIST
    • Regression Analysis: http://stats.270a.info/analysis/worldbank:GC.DOD.TOTL.GD.ZS/transparency:CPI2009/year:2009

    Cool URIs? 1, 5, 100, 10000 years? Ha!
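    Identifiers like the ones above are meant to be looked up. The sketch below prepares, without sending, an HTTP request that asks for an RDF serialization of one of them. It assumes the server honours content negotiation for text/turtle, which the slides do not state; the helper name is hypothetical.

```python
from urllib.request import Request


def rdf_request(uri, media_type="text/turtle"):
    """Prepare (but do not send) an HTTP GET asking for an RDF serialization."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


req = rdf_request("http://oecd.270a.info/property/TIME")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# Sending it would be urllib.request.urlopen(req), which needs network access
# and a server that still answers at this address.
```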

  25. What will be your artefacts?

  26. Consider the following

    • Artefacts, Artefacts, Artefacts
    • Citizen-centric interfaces
    • Provenance
    • Discoverability
    • Comparability

  27. Linked Statistical Data Analysis

    Sarven Capadisli

    http://csarven.ca/#i

    @csarven

  28. Credits
