OKDP

The modern open source data platform

A free, open-source, and cloud-native data platform designed for Kubernetes

OKDP is a free, open-source data stack designed and implemented for Kubernetes and released under the Apache License 2.0. Openness and adaptability are key considerations in the design of the Open Kubernetes Data Platform. It offers a handpicked collection of leading open-source data technologies, such as Apache Spark, JupyterHub, and Trino, with full, native Kubernetes integration.

OKDP takes a different approach from existing options, which either promote their own proprietary solutions or increase vendor lock-in. Every data layer in OKDP can be added or removed quickly and can run anywhere: on-premises or in a private or public cloud.

TOSIT, The Open-Source I Trust

TOSIT is an association that promotes community-driven initiatives to create truly open-source technologies and platforms. The association brings together numerous companies and administrations, including DGFiP (Direction Générale des Finances Publiques), BPCE (Banque Populaire, Caisse d'Epargne et Natixis), and Société Générale.

OKDP is currently implemented and managed mainly by DGFiP.

Participation in OKDP is open to all, with the aim of ensuring that the technology stack is accessible, efficient, and powerful for everyone.

Features

  • Data Centric

    OKDP aims primarily to provide data technologies that cover a wide range of architecture patterns, such as data mesh and data fabric, while respecting the data lifecycle.

  • Cloud Native

    The OKDP architecture decomposes components into loosely coupled services to help manage complexity and improve the speed, agility, and scale of software delivery.

  • Open Source

    The source code of all components in use is available on our GitHub under an Apache-compatible license.

  • 100% Free and Community Driven

    All the technologies and services delivered by the OKDP project are free to use. It is built by its users to answer critical business requirements.

  • Production Readiness

    OKDP ships components with a proven track record in terms of performance and community engagement. There is no trade-off on security or stability: components are developed and supported from the start in secure mode and in high availability (HA), with the aim of simplicity and efficiency.

  • Technological Independence

    Full control over deployment, whether of the whole platform or of individual components, without vendor lock-in and in compliance with local regulations.

  • Environment Agnostic

    OKDP targets multiple environments, including public and private clouds as well as on-premises bare-metal infrastructure.

  • Automatic Build

    Every service or component release is built from a reference branch of the GitHub repository using GitHub Actions, with act for local builds.

  • Automatic Deployment

    All components and services are deployed automatically with native Kubernetes integration.

Architecture

These tools and platforms form an integral part of the OKDP architecture.

Roadmap

  1. JupyterHub: On-Demand Notebooks

    • Build JupyterLab images automatically via GitHub Actions.
    • Provide a customized Helm chart based on the Jupyter community's chart.
  2. Apache Spark: A large-scale data analytics engine

    • Develop an authentication module for Spark History Server and Spark UI, including its release management to Maven Central.
    • Provide a customized Helm chart for the Spark History Server.
    • Provide customized OKDP images.
  3. Trino & Superset

    • Provide a customized Helm chart.
    • Provide OKDP images.
  4. OKDP Sandbox with User Guide

    • Provide a sandbox to deploy OKDP components locally.
    • Provide a User Guide for deploying OKDP components on a Kubernetes cluster.