So, you want to go Open?

Samath Sitthida, Antoine Masson

December 4th 2018

Part 3: Documentation

Documentation supports open source and versioning

Keep track of one’s own work

Understand others’ work

Internal communication

External communication

Identify source code/software

Identify code/software producers (myself or fellow coders)

Share/Reuse code

Contribute/Get contributions

Cite/Be cited

Credit/Be credited

Warning

Sharing and reusing IS NOT an invitation to plagiarism.

You still have to cite the code and use it according to its license.

Documentation also supports

Software indexing/retrieval in:

  • software platforms

  • repositories

  • libraries

Software long term preservation/archiving

What you want to document

What you will mention

  • Description of the source code

  • Description of the software

Including

  • Code explanations

  • Usage

  • Author

  • Contact information

  • UPIds (Unique Persistent IDentifiers)

  • License

Choose a license and stick to it.

Changing a license can be very painful: you need to ask all the contributors.

Here is the list of open source licenses suggested by the TTO: https://tto.epfl.ch/scientists/software/choose_the_license/open_source_licenses/

MIT

BSD

Apache 2.0

GPL

LGPL

AGPL

Challenges

Dependencies documentation management

Various ways to document code

  • embedded documentation

  • supported documentation with a README file

  • supported documentation with metadata

  • documentation AND publication with a software paper

Embedded documentation

  • comments & annotations

  • documentation generation

A Python example
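A minimal sketch of embedded documentation in Python, combining a module docstring, a function docstring and an inline comment. The function `mean_temperature` and its parameter are hypothetical names chosen for illustration only.

```python
"""Toy module showing embedded documentation (this is the module docstring)."""


def mean_temperature(readings):
    """Return the arithmetic mean of a sequence of temperature readings.

    Args:
        readings: Non-empty sequence of numbers, in degrees Celsius.

    Returns:
        The mean as a float.

    Raises:
        ValueError: If ``readings`` is empty.
    """
    if not readings:
        # Fail early with a clear message instead of a ZeroDivisionError.
        raise ValueError("readings must not be empty")
    return sum(readings) / len(readings)
```

Calling `help(mean_temperature)` in an interpreter prints the docstring, and documentation generators can build reference pages from the same text.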

Doxygen Documentation

Read the Docs

Supported documentation with a README file

README file shall include

  $project
  ========
  
  $project will solve your problem of where to start with documentation,
  by providing a basic explanation of how to do it easily.
  
  Look how easy it is to use:
  
      import project
      # Get your stuff done
      project.do_stuff()
  
  Features
  --------
  
  - Be awesome
  - Make things faster
  
  Installation
  ------------
  
  Install $project by running:
  
      install project
  
  Contribute
  ----------

  - Issue Tracker: github.com/$project/$project/issues
  - Source Code: github.com/$project/$project
  
  Support
  --------
  
  If you are having issues, please let us know.
  We have a mailing list located at: project@google-groups.com
  
  License
  --------
  
  The project is licensed under the BSD license.
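One way to keep a README aligned with this template is to check for its section titles automatically. The sketch below assumes titles are underlined with dashes or equals signs, as in the template; the function name `missing_sections` and the list `EXPECTED_SECTIONS` are hypothetical.

```python
import re

# Section titles taken from the README template above.
EXPECTED_SECTIONS = ["Features", "Installation", "Contribute", "Support", "License"]


def missing_sections(readme_text, expected=EXPECTED_SECTIONS):
    """Return the expected section titles that the README does not contain.

    A section is recognised as a non-empty title line immediately followed
    by an underline made only of dashes or equals signs.
    """
    lines = [line.strip() for line in readme_text.splitlines()]
    found = set()
    for title, underline in zip(lines, lines[1:]):
        if title and re.fullmatch(r"[-=]{3,}", underline):
            found.add(title)
    return [name for name in expected if name not in found]


sample = "Demo\n====\n\nFeatures\n--------\n\n- Be awesome\n\nLicense\n-------\n\nBSD\n"
print(missing_sections(sample))
```

Such a check could run in continuous integration so that a missing License or Installation section is noticed before release.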

Supported documentation with metadata

DOAP (Description Of A Project) description files

CITATION files

(2 examples: CFF and codemeta)

EXAMPLE 1: Citation File Format (CFF) file (YAML)
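A minimal sketch of what a CITATION.cff file can contain, rendered from Python. The key names follow the CFF 1.2.0 schema as commonly documented and should be checked against the specification. The helper `render_cff` and all example values (project title, author, version, date) are hypothetical.

```python
def render_cff(title, authors, version, date_released):
    """Render a minimal CITATION.cff document as YAML text.

    ``authors`` is a list of (family_names, given_names) tuples.
    Values are double-quoted so that YAML always reads them as strings.
    """
    def quoted(value):
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {quoted(title)}",
        f"version: {quoted(version)}",
        f"date-released: {quoted(date_released)}",
        "authors:",
    ]
    for family, given in authors:
        lines.append(f"  - family-names: {quoted(family)}")
        lines.append(f"    given-names: {quoted(given)}")
    return "\n".join(lines) + "\n"


# Hypothetical project and author, for illustration only.
print(render_cff("demo-project", [("Doe", "Jane")], "0.1.0", "2018-12-04"))
```

The resulting text would be saved as `CITATION.cff` at the root of the repository.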

EXAMPLE 2: codemeta file (JSON-LD)
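A minimal sketch of a codemeta description built with Python's standard `json` module. The property names come from the codemeta 2.0 vocabulary as commonly documented; the helper `build_codemeta` and every example value (project name, repository URL, author) are hypothetical.

```python
import json


def build_codemeta(name, version, license_url, repository, authors):
    """Return a minimal codemeta description as a JSON-LD string.

    ``authors`` is a list of (given_name, family_name) tuples.
    """
    document = {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "license": license_url,
        "codeRepository": repository,
        "author": [
            {"@type": "Person", "givenName": given, "familyName": family}
            for given, family in authors
        ],
    }
    return json.dumps(document, indent=2)


# Hypothetical project, for illustration only.
print(build_codemeta(
    name="demo-project",
    version="0.1.0",
    license_url="https://spdx.org/licenses/BSD-3-Clause",
    repository="https://example.org/demo-project",
    authors=[("Jane", "Doe")],
))
```

The output would typically be stored as `codemeta.json` next to the source code.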

Repository metadata

EXAMPLE 3: Zenodo repository with DOI attribution

http://doi.org/10.5281/zenodo.1469603

Software paper

documentation AND publication

EXAMPLE 1: Journal of Open Source Software (JOSS)

https://doi.org/10.21105/joss.01091

EXAMPLE 2: Journal of Cheminformatics

https://doi.org/10.1186/s13321-018-0297-4

A few facts and figures

A quick scoping search on GitHub showed:

several thousand mentions of DOAP files,

a few hundred mentions of codemeta files,

a few dozen mentions of CFF files…

58,300 DOIs for software (May 2018), compared with

175 million DOI names worldwide (July 2018)

Dozens of journals dedicated to scientific software

Focus on metrics

Software and source code developers as full scientific contributors

Metrics for acknowledgment

Metrics for data lead to metrics for code

Metrics for evaluation purposes

Is code/software “just data”?

For citation purposes, code/software is clearly not “just data”.

Metrics for code are still a new topic, but one that is gaining attention and fostering efforts such as a FORCE11 working group.

Metrics for code:

traditional scientometrics

newer altmetrics

Defining metrics for code is

a work in progress

… and controversial

Food for thought

Remember

  • comments, annotations, doc generation
  • README
  • DOAP
  • metadata (CFF, codemeta)
  • software papers

Can you relate?

Stay tuned

  • Code plagiarism
  • Licenses
  • Dependencies documentation management
  • Metrics for code

Can you relate?

Further reading

Examples of source code (metadata) repositories

Software-specific metadata schemas

Software preservation initiatives

Open Source Licenses

Recommended reading