Data POPs

What is a Data POP (Point of Presence)?

A Data POP (Point of Presence) is a software framework that brings together different resources and services (Docker containers, staging services, streaming services, etc.) so that users can discover, access, and manage data smoothly when creating ML workflows.

Data POPs in the NDP Hub

The NDP Hub can be integrated with one or more Data POPs to give users a cohesive experience and a single platform for data access, workspace management, and use of computational resources. The workflow spans both frontend user interactions and backend system integration.

General Workflow

Key Components of a Data POP

  • Authentication and Authorization: handled through Keycloak.
  • Resource Tracking: enabled by sending system information to the NDP Hub.
  • Workspace Capabilities: users can browse datasets from both the NDP and Data POP catalogs.
  • Jupyter Instance Launching: users choose whether to run on Nautilus or on a specific Data POP, and can configure the number of CPUs, RAM, storage, etc.
  • Technology Stack: backend integration through REST APIs, with system monitoring through Prometheus.
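
To make the components above more concrete, here is a minimal Python sketch of how a Data POP might prepare two REST calls to the NDP Hub: a system-information report for resource tracking, and a Jupyter launch request carrying the CPU, RAM, and storage options. This is an illustration under stated assumptions, not the actual NDP Hub API. The base URL, the endpoint names (`system-info`, `jupyter/launch`), the payload field names, and the use of a Keycloak-issued bearer token in the `Authorization` header are all hypothetical. The requests are only built, never sent.

```python
import json
import os
import platform
import shutil
import urllib.request
from dataclasses import asdict, dataclass

# Hypothetical base URL; the real NDP Hub API is not described in this document.
HUB_URL = "https://ndp-hub.example.org/api"


@dataclass
class JupyterLaunchRequest:
    """Illustrative launch options; all field names are assumptions."""
    target: str       # "nautilus" or the name of a specific Data POP
    cpus: int
    ram_gb: int
    storage_gb: int


def collect_system_info(path: str = ".") -> dict:
    """Gather basic host facts a Data POP could report for resource tracking."""
    disk = shutil.disk_usage(path)
    return {
        "hostname": platform.node(),
        "os": platform.system(),
        "cpu_count": os.cpu_count(),
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
    }


def build_post(endpoint: str, payload: dict, token: str) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{HUB_URL}/{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumed flow: an access token obtained from Keycloak after login.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    report = build_post("system-info", collect_system_info(), token="<access-token>")
    launch = build_post(
        "jupyter/launch",
        asdict(JupyterLaunchRequest(target="nautilus", cpus=4, ram_gb=16, storage_gb=50)),
        token="<access-token>",
    )
    print(report.get_method(), report.full_url)
    print(launch.get_method(), launch.full_url)
```

Passing either prepared request to `urllib.request.urlopen` would actually send it. Building and sending are kept separate here so the payloads can be inspected without a live Hub.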

Sample Workflow