okieli dokieli

Sarven Capadisli
CC BY 4.0

dokieli is a socially-aware clientside editor for decentralised article publishing, annotations and social interactions with an ocean of open web standards at its disposal.

I made the first commit to dokieli 10 years ago. I built what I needed, and use it (still). This is the story of procrastination of epic proportions, making connections, and having fun.

After working on the StatusNet platform in 2010, I found myself back in academia pursuing a master's degree, perhaps for no particular reason other than to change my focus in life and to play with SW/LD stuff.

It became clear from early on that – ironically – academics essentially operate within a paper-centric framework to create and disseminate publicly-funded knowledge… usually by delegating it to third-party publishers which still operate in the 15th century.

This was all strange to me, as I'd assumed SW/LD researchers would be at the front lines of making academic information freely available to anyone, getting the best out of the Open Web Platform, along with everything they've been advancing. The original web developer offered the WWW project to help information sharing and designed it for a social effect – to help people work together; with hindsight, there were mostly societal challenges, rather than technical ones.

As a person from the control-it-yourself and do-it-yourself school of thought, it was hard for me to just accept the status quo. Essentially, academics had to communicate their findings by crafting static PDFs based on publishers' print templates, relying on third-party services to publish them on a website somewhere, and, most disturbingly, were pressured to waive their intellectual rights or perish. I was given the option to choose to play along with the rules written in stone, choose to ignore the elephant in the room, choose to learn TeX, choose to obey the scholarly communication industrial complex, choose to waste taxpayers' money with the privileges afforded to me, choose my future. But then I thought, why would I want to do a thing like that? I chose not to do those things. I chose something else. I chose to take myself outside the box, to test the boundaries of authority, and demonstrate what the native affordances of the web could enable for research communication. Being the underdog was a calling. The good news was that I wasn't alone in wanting to change things, and many had been working at all levels on opening up and modernising scholarly communication for a long time. The bad news was that multinational conglomerates had been a parasite in the education system for a long time.

There was a plethora of dismissals from people at the mere idea of self-publishing one's own research (and development) using off-the-shelf web standards and not giving in to whatever so-called publishers had to offer. How dare I, a no-name junior researcher, question the practice and principles of superstar researchers? One of the initial challenges that some people raised was that the document, for reasons beyond my comprehension, needed to look and feel like an academic paper. Some even expected pixel-perfection when printed, as well as random features being well-supported in all web browsers. Challenge accepted. Some communities apparently even preferred to first read the left column and then scroll back to the top of the page and read the right column on their screen. They were stuck in a static two-dimensional space when they had an interactive hypermedia system right in front of them. This was something I had to entertain for the time being because, as I saw it, there were far more interesting things to do with documents, at the very least better user interactions based on structured units of information. I wanted to demonstrate that the research community, especially the content creators, can be URI owners for any resource of significance in their articles: problem statements, motivation, hypothesis, arguments, workflow steps, methodology, design, results, evaluation, conclusions, future challenges, as well as all inline semantic citations, to name a few! Using RDFa in markup languages was ideal and completely within reach. Why RDFa? Because! Besides, that was just plumbing considering that an authoring and publishing tool for end-users was the goal. Since HTML and CSS had already been mature for a long time by 2011, getting the view in reasonable shape wasn't so much of a challenge. I mean, CSS Zen Garden and other initiatives had widely made their point about web standards a decade before.
I demonstrated HTML+RDFa+CSS that replicated well-known layouts for research documents in computer and information science, e.g., LNCS and ACM, to drive home the point that academic articles via LaTeX/PDF had nothing above and beyond to offer, especially when they tend to end up on the web anyway. From then onwards, I tried to get the attention of the wider SW/LD community at conferences, on mailing lists, and in random encounters about making visible changes. Activism was already underway.

In 2012, I published my master’s thesis, Statistical Linked Dataspaces, incorporating everything that I'd said was doable. To fulfil the academic requirements, I generated a PDF from the web browser print dialogue and submitted it along with the print copy. By that point, I was quite engaged in the Linked Data space and academic work, so I transitioned to pursuing a Ph.D. degree.

In 2013, the HTML templates and stylesheets became more prevalent as I actively used them to publish my own academic articles on my website and output a dead-weight PDF from the web browser to fulfil the demands of web conferences and journals. Of course, I still argued with the SW/LD community about the nonsense of PDFs, publishers, the way conferences are organized (and being manipulated by publishers), and everything in between. Figured the least SW/LD conferences can do is accept contributions to the field using HTML and friends on equal grounds. I coined Linked Research as part of a call for polemics to suggest that we can use the web stack and everything else that's already at our disposal, and have applications that use the data to communicate more effectively. Alongside the initiative, I put out research articles in HTML+RDFa, CSS, and initial JavaScript to exemplify publishing and interactions, and committed the source code as open source (which moved to the linkeddata GitHub organisation in 2015).

By 2014, I was investing quite a bit of time researching and enriching articles with anything I could get my hands on. After all, there was already a lot of work out there on SW vocabularies to describe units of research information. I introduced JavaScript to help generate a table of contents, show fragments, fetch Linked Data, build references, and export the HTML to file via the DOM. While there were some advancements, trying to get the whole community to come along even an inch, and for authors to consider switching their existing ways, remained challenging. I tried to make it clear that this work was not strictly for academic articles, and that a single-page application that is progressively enhanced can help create and edit any kind of article, including reviews and assessments, technical specifications, news, social media posts, and even slideshows. At this point, the project still did not have a unique name beyond the idea of it being an editor alongside the initiative to Call for Linked Research.

In early to mid-2015, more work was done towards the writing aspect using HTTP, Web Storage, and other user interactions. In mid-2015, I joined the Solid project at MIT (nowadays Solid Project), which had a high overlap with the core ideas around how clientside applications can be used to write and publish articles in a context where individuals control their own identity and resources. By that point, I was working less and less on statistical linked dataspaces and analysis towards my Ph.D., and more and more on Solid and what I was considering as “procrastination”. In mid to late 2015, another procrastinator and I split the Linked Research initiative from the tooling and called the tooling dokieli. It is really a made-up word inspired by both Ned Flanders' “okily dokily!” phrase and the Orb’s album Okie Dokie It's the Orb on Kompakt. Nowadays, people typo dokieli all the time, which is acceptable, and everyone pronounces it differently, which is great.

By late 2015, it was possible for the user to bring their own WebID for authentication as well as to make use of their profiles, use common editing and authoring features, perform write operations against documents, create annotations, send notifications, and perhaps most interestingly the “save as” feature to copy the document and its media to a personal storage or anywhere they were authorised to (based on Web Access Control) in Solid. That about covered the essentials to move things forward.

From 2015 to the present, Solid has played an instrumental role in situating and clarifying dokieli's scope and approach with respect to read-write operations across the Web. This has enabled content creators and readers to remain in control. dokieli remains an application within the web browser, either embedded in an HTML document or as a browser add-on, with no server-specific dependencies, desktop installations, or centralised components. It uses open web standards and takes advantage of personal storage, profiles, and other Linked Data.

In addition to the development of what is currently known as the Solid Protocol, the availability of technical specifications between 2016 and 2017, such as the Web Annotation Data Model, Embedding Web Annotations in HTML, Linked Data Notifications (LDN), ActivityPub, and Activity Vocabulary, to name a few, allowed some of the most significant features and interactions in dokieli to fall into place. If there was ever a doubt about its mission and potential, all of that further grounded the work on web standards and doubled down on interoperability. There is a complete list of specifications that dokieli implements.

It was amazing to see how a user can share a document with anyone in their contacts, which are discovered through their profile, by sending their inbox a notification. It was also possible to demonstrate how another user can annotate (e.g., motivated by assessing, bookmarking, commenting, describing, highlighting, questioning, replying, tagging) anything of interest in a document by storing it at their preferred location, sending a notification about it to the document's inbox, and making it possible for other applications to reuse the annotations. dokieli would find the document's inbox, fetch its notifications, discover the social activities and annotations about the article somewhere on the Web, and then present the annotation to the reader precisely in the context where it was originally annotated. The article, inbox, notification, annotation, and profiles can all be controlled by different parties, working with completely decentralised information on the Web.
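The flow described above can be sketched in a few lines. This is only an illustration in Python, not dokieli's own code (dokieli is a JavaScript application in the browser). It assumes the inbox is advertised in an HTTP `Link` header with the `http://www.w3.org/ns/ldp#inbox` relation, as LDN allows, that the annotation follows the Web Annotation Data Model with a text quote selector, and that the notification is an ActivityStreams `Announce`. The function names and all `example.org` URLs are hypothetical.

```python
import json
import re
import urllib.request
from typing import Optional

# Link relation that LDN uses to advertise an inbox.
LDP_INBOX = "http://www.w3.org/ns/ldp#inbox"


def find_inbox(link_header: str) -> Optional[str]:
    """Return the inbox URL advertised in an HTTP Link header, if any."""
    # Each link-value looks like: <target>; rel="one or more relation types"
    for target, params in re.findall(r'<([^>]*)>\s*((?:;\s*[^,<]*)*)', link_header):
        rel = re.search(r'rel\s*=\s*"?([^";]*)"?', params)
        if rel and LDP_INBOX in rel.group(1).split():
            return target
    return None


def build_annotation(article: str, exact: str, comment: str,
                     motivation: str = "commenting") -> dict:
    """Describe a comment on a quoted piece of text in the article."""
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "type": "Annotation",
        "motivation": motivation,
        "body": {"type": "TextualBody", "value": comment},
        "target": {
            "source": article,
            # The selector lets a consumer re-anchor the annotation in context.
            "selector": {"type": "TextQuoteSelector", "exact": exact},
        },
    }


def build_notification_request(inbox: str, actor: str, annotation_url: str,
                               article: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST that tells the inbox about the annotation."""
    payload = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Announce",  # assumed activity type for this sketch
        "actor": actor,
        "object": annotation_url,
        "target": article,
    }
    return urllib.request.Request(
        inbox,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/ld+json"},
        method="POST",
    )


if __name__ == "__main__":
    header = '<https://example.org/inbox/>; rel="http://www.w3.org/ns/ldp#inbox"'
    print(find_inbox(header))
```

The annotation would be stored wherever the annotator prefers, and only its URL travels in the notification. That keeps the article, inbox, annotation, and profiles independently controlled. Passing the prepared request to `urllib.request.urlopen` would actually send it.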

Even in 2016, the HTML+RDFa patterns were mature and flexible enough to be used for conference and workshop proceedings (CEUR-WS) and even technical reports (W3C). The W3C accepted the dokieli version of the LDN specification since the required CSS rendered the underlying HTML just like other specifications, and it passed all publication rules. Additionally, the LDN test suite and the underlying tool mayktso for HTTP interactions were derived from dokieli's source code, which turned out to be a great return on investment.

In early 2017, I self-published a peer-reviewed article, Decentralised Authoring, Annotations and Notifications for a Read-Write Web with dokieli, which currently has a mixture of meaningful external annotations and a mess of others left by random people testing it out. Mission accomplished.

By mid-2017, there was plenty of material available to facilitate communication about Solid and Linked Research, and people from different communities took notice when I gave workshops and talks at conferences and research labs. dokieli helped people to understand the core concepts and connect the dots. It also helped me to understand works in libraries and archives that have been around for so long, especially the notion of decoupled functions and reconstructing information flows for a unit of communication. All of this helped me tremendously to refocus my dissertation on what I had been passionate about for years.

In late 2017, another procrastinator who had been involved with the project for some time got frustrated with the codebase to the point where they decided to take it upon themselves to modularise the code. This made it easier for the code to be maintained, and attracted other contributors.

By this point, it was also possible to demonstrate how various kinds of documents can be described and linked together, for instance a technical specification, its test suite, implementation report of a project, reports summary, the project, and an article citing the specification in Linked Specifications, Test Suites, and Implementation Reports.

In 2018, I was still working on my dissertation (writing it using dokieli, of course). Much of the time went into adding features that I wanted to use as well as to further help communicate applying Solid towards Linked Research. I incorporated all kinds of stuff related to archiving, robust links, versioning resources, citations, and improvements to the graph visualisation (and navigating through the Linked Data). Some web conferences agreed to accept research articles in HTML+RDFa, perhaps due to a combination of years of activism around Linked Research, evidence supporting its feasibility, and their own initiative. However, since the conferences still had attachment issues to for-profit publishers, authors needed to provide a LaTeX version of their articles somewhere in the pipeline. I was content to take what we could get because it was proof that the discussion and considerations in the SW/LD community had started to change. Some research teams got on board with the idea of Linked Research and Solid, and not only started to have HTML+RDFa based articles but also built their own tooling. That was further proof that the ideas had caught on, and that it wasn't about a particular tool. There are some works that adopted dokieli and examples in the wild.

The implementation of dokieli served as a core contribution to my 2019 dissertation, Linked Research on the Decentralised Web, demonstrating how a chain of operations can be orchestrated for decentralised units of scholarly communication. It helped to clarify and promote two key notions: degree of control and self-publishing. They got good mileage towards helping people understand Linked Research and Solid. dokieli was just one of the key applications that helped people “get” Solid.

Later in 2019, I felt confident enough to proclaim “third-party control considered harmful” to a scholarly publishers association, and challenged them to make a better offer than the open proposal, where, for example, for approximately 100 EUR/year, one can self-publish with unlimited individually-controlled resources; universally accessible, interoperable; publicly archivable; reusable by standards-compliant applications; social interactions; privacy respecting. I compared this option to the alternative, such as waiving rights and/or wasting thousands to put up a single PDF on a website. There have been zero responses to the challenge… why remains a mystery. What ultimately mattered were some questions for the academic community: What constitutes “open”? What are the constraints? How inclusive? Who ultimately controls your online identities and data? What are the real costs to participate? How about social pressure? Are we devising ethically grounded mechanisms and artifacts?

From 2019 onwards, a part of the work on dokieli aimed to improve the human and machine readability of (Solid) technical specifications by enabling the identification, description, and reuse of significant units of information, perhaps more prominently the requirements and concepts. Additionally, it demonstrated how specifications, test coverage, and implementation reports are linked, making it relatively easy for specification editors, authors, and implementers to understand how specifications can be improved by better observing what their documents expressed and where improvements are needed. However, there is still exciting work ahead before it truly becomes a specification editing environment, and I would be the first to say that.

In 2022, another procrastinator found the project exciting and became involved to modernise the codebase. Perhaps they were puzzled as to how something like this could continue for close to a decade without having tests!

The quality of the code is far from perfect and we didn't have the luxury of time to architect what grew organically. As any side project might go, some of the components and the libraries it depends on have aged and are due for updates, but all in all, it is still functional. There was no dedicated full-time team to work on the project but I hope that will change one day. dokieli has served its social and technical purpose abundantly, and I believe it still has untapped potential.

There have been all sorts of significant challenges that dokieli either works with, or around. Authentication mechanisms and libraries are among the first things that come to mind. If there is anything that's been technically problematic for using dokieli (and Solid for that matter), it has been the state of authentication (in Solid). Having users create and control their own WebIDs and storage space has also been challenging for a number of reasons that are beyond the scope of dokieli.

dokieli is an implementation which showcases various disparate parts – web standards, technologies, and social ideals – coming together in a cohesive way to help readers and creators take full advantage of the capabilities of the Web. There is more to the application than first meets the eye. It would be great to have more demos and improve its documentation.

During the development of dokieli, I drew inspiration from many other tools. The closest works to dokieli are WorldWideWeb, the first web browser (or browser-editor, rather), and Amaya, a desktop web browser and editor that was developed by W3C from 1996 to 2012 to provide a framework for experimenting with and validating web specifications. It allows documents to be created and updated with remote access features.

There have been many people instrumental in the development of the ideas and the code of dokieli. You can see some of these great contributors on the GitHub repository. I am grateful to these folks who invested their time and expertise. It is equally important to acknowledge the folks who dismissed the ideas or the tool itself in some shape or form. Their views either strengthened the necessary activism or translated into a healthy dose of spite-driven development.

dokieli has been a proof-of-concept in every sense of the word. Never a product, and never an end in itself. A playground, a laboratory, made with fun, but with progress on serious issues at its heart. The last ten years of dokieli have been an adventure.

What's next?

Whatever it is, let's make it so!