Web Tripping
Sarven Capadisli
BFH, Web Tripping, Bern, 2015-04-13 #LinkedData #BFH

Are you in the right room?
Lecture Stuff
- Notes and exercises: http://wbtc.bfh.ch/2015/linked-data
- Important stuff: Design Issues: Architectural and philosophical points
- Email: sarven.capadisli@bfh.ch
Lecture Overview
- Web / Linked Data
- HTTP, URI (the fundamentals)
- RDF data model (language) and syntaxes
- SPARQL (querying for RDF)
Technology Context

Illustration by Sandro Hawke
HTTP
- An application protocol that is at the core of data communication on the WWW
- Version 1.1 is the most widely used, HTTP/2 is on the way
- Request methods (e.g., GET, POST, PUT, DELETE)
- Responses with status codes (e.g., 200 OK, 301 Moved Permanently, 303 See Other, 404 Not Found) [seeAlso cats]
HTTP Session
HTTP GETting a particular representation of a resource:
- Includes HTTP Accept headers when requesting, HTTP Content-Type in response
Content negotiation
The URI http://dbpedia.org/resource/Switzerland says nothing about the nature of the resource. Can it return HTML, RDF?
- Request: I accept HTML and RDF/XML, prefer HTML
- Response: Okay, go to http://dbpedia.org/page/Switzerland (303 See Other)
- Request: I accept HTML and RDF/XML, prefer HTML
- Response: Okay, I have an HTML representation
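The exchange above can be sketched from the server's side. This is a minimal illustration, not how DBpedia implements it: the names `parse_accept`, `negotiate` and `AVAILABLE` are hypothetical, and wildcards such as `*/*` are deliberately not handled.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # q defaults to 1.0 when the client omits it
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(header, available):
    """Return the most preferred media type the server can produce, else None."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in available:
            return media_type
    return None


# Hypothetical set of representations held for one resource
AVAILABLE = {"text/html", "application/rdf+xml"}

print(negotiate("text/html;q=0.9, application/rdf+xml;q=0.8", AVAILABLE))
print(negotiate("application/rdf+xml", AVAILABLE))
```

A client that lists only unsupported types gets `None` here; a real server would typically answer 406 Not Acceptable or fall back to a default representation.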
Dereferencing a URI to HTML
curl -iLH "Accept: text/html" http://dbpedia.org/resource/Switzerland
(requesting an HTML response)
curl -iLH "Accept: text/html;q=0.9, application/rdf+xml;q=0.8" http://dbpedia.org/resource/Switzerland
(preferring an HTML response)
Dereferencing a URI to RDF
curl -iLH "Accept: application/rdf+xml" http://dbpedia.org/resource/Switzerland
(requesting an RDF/XML response)
rapper -i rdfxml http://dbpedia.org/resource/Switzerland
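The same request can be made from a program. The sketch below uses only Python's standard library; the function names `build_request` and `dereference` are illustrative. Only the request object is built and printed here, so nothing is sent over the network until `dereference` is called.

```python
from urllib.request import Request, urlopen


def build_request(uri, accept):
    """Build a GET request that asks for a specific representation."""
    return Request(uri, headers={"Accept": accept})


def dereference(uri, accept):
    """Send the request and return (Content-Type, body).

    urlopen follows redirects such as 303 See Other by default,
    which plays the role of curl's -L flag.
    """
    with urlopen(build_request(uri, accept)) as response:
        return response.headers.get("Content-Type"), response.read()


req = build_request("http://dbpedia.org/resource/Switzerland",
                    "application/rdf+xml")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```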
URIs / IRIs
- A URI identifies (refers to or names) a web resource
- URIs are unique and essentially represent some thing (e.g., document, concept)
- IRI (Unicode/ISO 10646) is a generalization of URI (ASCII character set)
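A rough sketch of how an IRI maps to a URI: non-ASCII characters are UTF-8 encoded and then percent-encoded. The helper `iri_to_uri` is hypothetical and simplified; it assumes the host is already ASCII and leaves internationalized domain names aside.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri):
    """Percent-encode non-ASCII characters in the path, query and fragment."""
    scheme, netloc, path, query, fragment = urlsplit(iri)
    return urlunsplit((
        scheme,
        netloc,  # assumed ASCII; internationalized hosts need extra handling
        quote(path, safe="/%:@"),
        quote(query, safe="=&%?/:@"),
        quote(fragment, safe="%?/:@"),
    ))


print(iri_to_uri("http://en.wikipedia.org/wiki/Neuchâtel"))
```

An IRI that is already plain ASCII, such as https://csarven.ca/#i, passes through unchanged.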
URI Syntax
<scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]
More at: http://en.wikipedia.org/wiki/URI_scheme
See also RFC 3986 (which obsoletes RFC 2396) and RFC 3987 for IRIs
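To see the syntax in action, the sketch below splits a few URIs from this deck into the four parts of the pattern using Python's standard `urlsplit`. The helper `components` and its dictionary keys are illustrative names only.

```python
from urllib.parse import urlsplit


def components(uri):
    """Split a URI into scheme, hierarchical part, query and fragment."""
    parts = urlsplit(uri)
    # The hierarchical part is the optional "//authority" followed by the path
    hierarchical = ("//" + parts.netloc if parts.netloc else "") + parts.path
    return {
        "scheme": parts.scheme,
        "hierarchical": hierarchical,
        "query": parts.query,
        "fragment": parts.fragment,
    }


for uri in [
    "http://www.ietf.org/rfc/rfc2396.txt",
    "ldap://[2001:db8::7]/c=GB?objectClass?one",
    "mailto:John.Doe@example.com",
    "https://csarven.ca/#i",
]:
    print(components(uri))
```

Note that only the first `?` starts the query, so the LDAP example keeps `objectClass?one` together as its query.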
Example IRIs
http://www.ietf.org/rfc/rfc2396.txt
http://en.wikipedia.org/wiki/Neuchâtel
mailto:John.Doe@example.com
ftp://ftp.is.co.za/rfc/rfc1808.txt
ldap://[2001:db8::7]/c=GB?objectClass?one
Example IRIs
news:comp.infosystems.www.servers.unix
irc://irc.freenode.net/csarven,isnick
telnet://melvyl.ucop.edu/
gopher://spinaltap.micro.umn.edu/00/Weather/California/Los%20Angeles
urn:uuid:96550970-f26f-41f0-9dbd-db4c0d522889
Example IRIs
urn:isbn:9780812696110
doi:10.1000/182
bitcoin:1H67NnpSGAUrSRA5SkPHMmqNqHqpXHuFGp
HTTP URI design patterns in the wild
http://{lang}.wikipedia.org/wiki/{Article}
http://dbpedia.org/resource/{Article}
http://dbpedia.org/property/{id}
https://twitter.com/{username}
http://reddit.com/r/{subreddit}
HTTP URI design patterns in the wild
http://imgur.com/gallery/{hash}
http://www.flickr.com/photos/{username}/{id}
http://delicious.com/tag/{id}
https://csarven.ca/#i
http://worldbank.270a.info/dataset/{id}
URI design patterns in the wild
http://creativecommons.org/licenses/{type}/{version}/
http://moodle.bfh.ch/course/view.php?id={id}
http://www.wirtschaft.bfh.ch/{lang}/{degree}/{id}.html
{firstname}.{lastname}@student.bfh.ch
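Patterns like these can be filled in mechanically. A minimal sketch, assuming every `{name}` placeholder stands for one value that should be percent-encoded; the function `expand` is hypothetical and much simpler than a full URI Template processor.

```python
import re
from urllib.parse import quote


def expand(template, **values):
    """Replace each {name} placeholder with its percent-encoded value."""
    def substitute(match):
        return quote(str(values[match.group(1)]), safe="")
    return re.sub(r"\{(\w+)\}", substitute, template)


print(expand("http://{lang}.wikipedia.org/wiki/{Article}", lang="en", Article="Bern"))
print(expand("http://dbpedia.org/resource/{Article}", Article="Switzerland"))
```

A missing value raises `KeyError`, which is a reasonable failure mode for a sketch: a half-filled pattern is not a usable identifier.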
Cool URIs?
- Dedicated service (1, 5, 100 years?)
- Consistent patterns
- Re-use existing identifiers
- Link multiple representations
Cool URIs?
- Avoid ownership, versions (usually), auto-increment
- Avoid query strings, file extensions
A must read is TimBL's Cool URIs don't change
See also 10 Rules for Persistent URIs (see it as a rough guideline, not musts)
Tools and services
- Command-line analysis: Raptor utils, or curl
- Linked Data browsers (e.g., Tabulator)
Identifying things

Data
- Data is everywhere (personal, government, health, events..)
- Uncovering insights
- Predictions
- Making decisions (e.g., where to save energy)
- Smarter systems
Data: What is it good for?
Absolutely everything!
- Understanding human societies
- Health conditions
- Stable economies
- How do/should things work?
Technology Flow

Core idea
- Structured data available on a global scale
- Connect related data items across multiple sources
- ?
- Profit
Why Linked Data?
- Classical data management vs. distributed Web
- Development using standards
- Open things up
- Gradual and sustainable
Linked Data Design Principles
- URIs as names for things
- HTTP URIs so that people can look up those names
- When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)
- Include links to other URIs for discovery
http://www.w3.org/DesignIssues/LinkedData.html
Linked Open Data
Linked Open Data Cloud

Wikipedia to DBpedia

A grain of rice

Linked Data Life Cycles
- Original data owners
- Data publishers
- Data enrichment parties
- Data consumers
RDF
- Essentially a language. A Graph model
- Similar to Entity-Attribute-Value (EAV) data model e.g., Sarven.height = 1.65m
- To describe concepts and their relationships (many graphs)
- Names for resources (URIs) e.g., https://csarven.ca/#i and http://dbpedia.org/property/height
Human languages
How do we express ourselves?
- Sentences: subjects and predicates (verbs), and sometimes objects
Body language: meh ;)
Vocabularies
- Decentralized invention (anyone can make theirs)
- Basics: RDF Syntax, RDF Schema
- Definitions for the relationship between things, or their classifications e.g., Friend of a Friend (FOAF), Dublin Core Terms
- RDF Data Cube: Data structure definitions, code lists, datasets, ..
- SKOS: Code lists, and concepts can be reused
Vocabularies
- SIOC: vocabulary for online communities
- Schema.org: various - Google/Bing/Yahoo/Yandex initiative
- Open Graph Protocol: webpages as part of rich objects in social graph - Facebook
What about OWL?
- Meant to be an Ontology Language for the Web
- Very powerful: rich semantic meaning
- .. but it can also be painful (PITA)
- Need domain experts
- Usually need to write domain specific applications
N-Triples
Sarven's height is 1.65m.
<https://csarven.ca/#i> <http://dbpedia.org/property/height> "1.65m" .
N-Triples
Bern is a city.
<http://dbpedia.org/resource/Bern> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/City> .
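N-Triples is simple enough to write by hand. The sketch below models a triple as three (kind, text) pairs and serializes one triple per line; the representation and the names `term` and `to_ntriples` are assumptions made for illustration, and language tags and datatypes are left out.

```python
def term(value):
    """Render one RDF term: IRIs in angle brackets, literals in quotes."""
    kind, text = value
    if kind == "iri":
        return f"<{text}>"
    # Escape the characters that may not appear raw inside a literal
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n")
                   .replace("\r", "\\r"))
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) triples, one per line."""
    return "".join(f"{term(s)} {term(p)} {term(o)} .\n" for s, p, o in triples)


TRIPLES = [
    (("iri", "https://csarven.ca/#i"),
     ("iri", "http://dbpedia.org/property/height"),
     ("literal", "1.65m")),
    (("iri", "http://dbpedia.org/resource/Bern"),
     ("iri", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
     ("iri", "http://dbpedia.org/ontology/City")),
]

print(to_ntriples(TRIPLES), end="")
```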
Turtle
Sarven is interested in electronic music and monkeys:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix wikipedia: <http://en.wikipedia.org/wiki/> .
Turtle
<https://csarven.ca/#i>
rdf:type foaf:Person ;
foaf:givenName "Sarven"@en ;
foaf:interest wikipedia:Electronica , wikipedia:Monkey ;
foaf:mbox <mailto:info@csarven.ca> ;
foaf:account <https://twitter.com/csarven> .
Turtle
Switzerland has capital Bern. Bern is a City.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbp: <http://dbpedia.org/property/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
Turtle
dbr:Switzerland dbp:capital dbr:Bern .
dbr:Bern
rdf:type dbo:City ;
geo:lat "46.950001"^^xsd:float ;
geo:long "7.439583"^^xsd:float .
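Prefixes are only shorthand: a Turtle parser replaces each prefixed name with the full IRI. A minimal sketch of that expansion, using the prefixes declared above; the names `PREFIXES` and `expand_prefixed_name` are illustrative.

```python
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "dbr": "http://dbpedia.org/resource/",
    "dbp": "http://dbpedia.org/property/",
    "dbo": "http://dbpedia.org/ontology/",
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
}


def expand_prefixed_name(name):
    """Turn a prefixed name such as dbr:Bern into its full IRI."""
    prefix, sep, local = name.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown prefix in {name!r}")
    return PREFIXES[prefix] + local


# The triple "dbr:Switzerland dbp:capital dbr:Bern ." with full IRIs
for name in ("dbr:Switzerland", "dbp:capital", "dbr:Bern"):
    print(expand_prefixed_name(name))
```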
RDFa example
<p about="https://csarven.ca/#i">
Sarven is
<span property="dbp:height">1.65m</span></p>
<p about="dbr:Bern"
typeof="dbo:City">Bern is a city.</p>
<p about="https://csarven.ca/#i">
Sarven is in <a rel="foaf:based_near"
href="http://dbpedia.org/resource/Bern">Bern</a>.</p>
RDF/XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://dbpedia.org/resource/Bern">
<rdf:type rdf:resource="http://dbpedia.org/ontology/City"/>
</rdf:Description>
</rdf:RDF>
RDF/XML
<rdf:Description rdf:about="http://dbpedia.org/ontology/City">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:label xml:lang="en">City</rdfs:label>
<rdfs:label xml:lang="de">Stadt</rdfs:label>
</rdf:Description>
Overview of the RDF formats
- Turtle: compact, human readable, triple pattern in SPARQL
- N-Triples: verbose, small footprint (inspection, processing), exchange
- RDF/XML: verbose, existing toolchains, tree model
- RDFa: (X)HTML page, for humans and machines
- JSON-LD: usually client-side dev, tree model
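JSON-LD is the only format in this overview without an example in the deck. As a rough sketch, "Bern is a city" can be built as a plain Python dictionary and serialized with the standard `json` module; the variable names are illustrative.

```python
import json

# "Bern is a city" as a JSON-LD document: @context plays the role of
# the prefix declarations, @id names the subject, @type is rdf:type.
doc = {
    "@context": {"dbo": "http://dbpedia.org/ontology/"},
    "@id": "http://dbpedia.org/resource/Bern",
    "@type": "dbo:City",
}

serialized = json.dumps(doc, indent=2)
print(serialized)
```

Because the result is ordinary JSON, client-side code can read it as a tree without an RDF library.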
SPARQL Protocol and RDF Query Language
- A way to query for triple patterns (as opposed to searching)
- SPARQL endpoints (query public datasets)
- Federated queries (merge patterns from different sources)
- See also: SPARQL 1.1 Query and SPARQL 1.1 Federated Query specifications.
SPARQL query example
Select the subjects that are cities.
SELECT ?city
WHERE {
?city a <http://dbpedia.org/ontology/City> .
}
SPARQL query result example
city |
---|
http://dbpedia.org/resource/Bern |
http://dbpedia.org/resource/Montreal |
... |
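What "querying for a triple pattern" means can be shown on a tiny in-memory graph. This is a toy sketch, not how a SPARQL engine is built: the data set is a hand-picked assumption, and `match` is a hypothetical helper that handles a single pattern only.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
CITY = "http://dbpedia.org/ontology/City"

# A hand-made graph: a set of (subject, predicate, object) triples
TRIPLES = {
    ("http://dbpedia.org/resource/Bern", RDF_TYPE, CITY),
    ("http://dbpedia.org/resource/Montreal", RDF_TYPE, CITY),
    ("http://dbpedia.org/resource/Switzerland",
     "http://dbpedia.org/property/capital",
     "http://dbpedia.org/resource/Bern"),
}


def match(pattern, triples):
    """Yield one {variable: value} binding per triple that fits the pattern."""
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                # A variable binds to anything, but consistently
                if binding.setdefault(want, got) != got:
                    break
            elif want != got:
                break
        else:
            yield binding


# Same pattern as the query: ?city a dbo:City
cities = sorted(b["?city"] for b in match(("?city", RDF_TYPE, CITY), TRIPLES))
print(cities)
```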
SPARQL query example
People who were born in Bern before 1900:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name ?birth ?person WHERE {
?person dbo:birthPlace dbr:Bern .
?person dbo:birthDate ?birth .
?person foaf:name ?name .
FILTER (?birth < "1900-01-01"^^xsd:date) .
}
ORDER BY ?name
See results from DBpedia SPARQL Endpoint.
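A query reaches an endpoint as an ordinary HTTP request: the SPARQL protocol lets a client send it via GET in a `query` parameter. The sketch below only builds the URL and sends nothing; `query_url` is a hypothetical helper, and http://dbpedia.org/sparql is assumed as the endpoint address.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

QUERY = """PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?name ?birth ?person WHERE {
  ?person dbo:birthPlace dbr:Bern .
  ?person dbo:birthDate ?birth .
  ?person foaf:name ?name .
  FILTER (?birth < "1900-01-01"^^xsd:date)
}
ORDER BY ?name
"""


def query_url(endpoint, query):
    """Build a SPARQL protocol GET URL with the query percent-encoded."""
    return endpoint + "?" + urlencode({"query": query})


url = query_url("http://dbpedia.org/sparql", QUERY)
print(url)
```

The resulting URL can be fetched with curl or a browser like any other HTTP URI.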
SPARQL query
List of project names in countries that are classified as lower middle income (LMC) and are situated north of the equator:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX wgs: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX property: <http://worldbank.270a.info/property/>
PREFIX country: <http://worldbank.270a.info/classification/country/>
SPARQL query
PREFIX income-level: <http://worldbank.270a.info/classification/income-level/>
PREFIX graph: <http://worldbank.270a.info/graph/>
SPARQL query
SELECT ?projectLabel ?countryLabel WHERE {
GRAPH graph:meta {
?country property:income-level income-level:LMC .
?country wgs:lat ?latitude .
FILTER (?latitude > 0)
?country skos:prefLabel ?countryLabel . }
SPARQL query
GRAPH graph:world-bank-projects-and-operations {
?project property:country ?country ;
skos:prefLabel ?projectLabel . }
} ORDER BY ?projectLabel
SPARQL query: Federated
CONSTRUCT { .. } WHERE {
{ SERVICE <http://data.linkedmdb.org/sparql> {
SELECT * WHERE { .. } }
}
UNION
{ SERVICE <http://dbpedia.org/sparql> {
SELECT * WHERE { .. } }
}
}
Tools and services
- Command-line analysis: Raptor utils, or curl
- SPARQL server and endpoint: Fuseki
- Query SPARQL endpoints: DBpedia, World Bank, .. many more, and the status of the ones in the LOD cloud.
- Linked Data browsers (e.g., Tabulator)
Tools and services
- Search for vocabularies in Linked Open Vocabularies (LOV)
- W3C validator
- See what's out there: LOD Cloud

Credits
- Slides using Shower Presentation Engine and homecooked RDFa