NGA Unclassified Data Lake (NUDL) 101
Origin
The NGA Unclassified Data Lake, known as NUDL, is a groundbreaking initiative commissioned by Congress as a Congressionally Directed Action (CDA). NUDL's primary purpose is to provide a collaborative platform for users across academia, industry, and NGA mission partners. This platform empowers users to swiftly onboard and foster collaboration within the GEOINT (Geospatial Intelligence) community.
NUDL serves as both a data lake and a sandbox environment, granting access to vast repositories of imagery data. Its mission is to enhance research productivity and facilitate collaboration among industry, academia, and NGA. It offers easy access to high-quality commercial imagery and functions as a space for prototyping, testing, and developing innovative analytic and Artificial Intelligence/Machine Learning (AI/ML) capabilities and geospatial products. The development of NUDL is guided by a foundational roadmap that includes access to cloud services, industry-standard GEOINT tools, and data and model exchanges.
To request a NUDL account on XC, please see the NUDL XC Onboarding Guide. Otherwise, you can access NUDL on UC with your Common Access Card (CAC), with no onboarding process required.
What Data is in NUDL?
Today, there are seven primary data sets in NUDL.
Electro-Optical Vendors:
- BlackSky
- Planet Labs
- Maxar
Synthetic Aperture Radar Vendors:
- ICEYE
- Umbra Lab Inc.
- Capella Space
Digital Elevation Models:
- Shuttle Radar Topography Mission
Capabilities and Benefits
NUDL is now operational within the Commercial Cloud (XC) and Unclassified Cloud (UC) frameworks. The XC environment hosts approximately 200 TB of commercial imagery from providers such as Maxar, Airbus, Capella, and ICEYE. The UC environment provides access to a larger subset of commercial data from these providers and, in the future, from new ones. Through the XC environment, NUDL aims to establish an open, stable, and reliable platform that fosters effective collaboration between academia and industry. This Initial Operational Capability (IOC) milestone release allows users to onboard, access commercial imagery data, use client-side and web-based Advanced Imagery Services (AIS) applications, and perform essential AI/ML functions. The data flow and processes within NUDL are illustrated in the accompanying chart.

NUDL's initial capabilities encompass:
- A diverse range of data holdings from various vendors.
- Imagery discovery through the Image Discovery web application and ArcGIS Pro Add-In.
- The ability to import existing AI/ML models.
- The capacity to execute these models on NUDL's data.
- The use of the Cyber Analysis Modification Program (CAMP) for secure data ingestion.
- Seamless connectivity with Amazon SageMaker Studio.
- Enterprise Identity and Access Management, illustrated below.

NUDL's Mission Application Nexus, illustrated below, offers an intuitive user experience that directs users to the appropriate function.

The ongoing User Acceptance Testing (UAT) phases ensure that NUDL's performance and features are intuitive and scalable for an expanding user base. Notably, a user-friendly landing page has been introduced to guide users through end-to-end workflows. The landing page includes:
- SpatioTemporal Asset Catalog (STAC) Data Catalog: Allows users to view and query data holdings, access metadata, and pan/scan images.
- API Documentation: Provides guidance on connecting external applications with NUDL.
- User Documentation: Offers how-to guides, walkthrough videos, deployment guides, and runbooks.
- ArcGIS Enterprise: Foundational software system for GIS, powering mapping and visualization, analytics, and data management.
- AIS Image Discovery Web Application: Allows users to view and query AIS imagery products and access metadata.
- AIS Image Discovery Pro Add-In: Gives GIS-savvy users the ability to search the AIS imagery holdings and extend their analysis using ArcGIS Pro COTS technology.
- Amazon SageMaker Studio: Supports the importation of custom models, resource monitoring, and AI/ML workloads.
- AI/ML Workbench: Enables team-building, customization of labeling jobs, and the creation of labeled training datasets.
- Contact Us: Directs users to NUDL@NGA.mil for inquiries, support, or suggestions.
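The STAC Data Catalog entry above mentions querying data holdings. As a rough illustration only, the sketch below builds a request body in the shape defined by the public STAC API Item Search specification (`bbox`, `datetime`, `collections`, `limit`). It assumes NUDL's catalog follows that specification, which this page does not confirm. The function name `build_stac_search`, the collection id `example-sar-collection`, and the bounding box and dates are all hypothetical placeholders; consult the NUDL API Documentation for the real endpoint and collection names.

```python
import json
from datetime import date


def build_stac_search(bbox, start, end, collections=None, limit=10):
    """Build a STAC API Item Search request body (hypothetical helper).

    bbox is (west, south, east, north) in WGS 84 degrees; start and end
    are datetime.date objects bounding the acquisition window.
    """
    west, south, east, north = bbox
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    if start > end:
        raise ValueError("start date must not be after end date")

    body = {
        "bbox": [west, south, east, north],
        # STAC API expects an RFC 3339 interval written as "start/end".
        "datetime": f"{start.isoformat()}T00:00:00Z/{end.isoformat()}T23:59:59Z",
        "limit": limit,
    }
    if collections:
        body["collections"] = list(collections)
    return body


# Placeholder values: the collection id below is invented for illustration.
request_body = build_stac_search(
    bbox=(-77.2, 38.8, -76.9, 39.0),
    start=date(2024, 1, 1),
    end=date(2024, 1, 31),
    collections=["example-sar-collection"],
)
print(json.dumps(request_body, indent=2))
```

The resulting dictionary would typically be sent as the JSON body of a POST to a catalog's search endpoint. The sketch stops short of any network call because the NUDL endpoint and its authentication flow are not described here.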
Experimenting with NUDL offers numerous benefits, including:
- Enabling researchers to access, analyze, and extend analysis of vast datasets that were previously inaccessible.
- Providing a collaborative environment for joint projects and knowledge sharing.
- Ensuring secure data management, crucial for sensitive information.
- Streamlining the development and testing of machine learning models.
- Keeping researchers updated on the latest advancements in AI/ML.
- Offering valuable insights to support the GEOINT community.
Real World Mission Impact
Following a successful IOC, the NUDL team received approval to test a mobile version. NUDL was deployed to a vehicle equipped with AWS Snowball, with drones used for real-time image capture and cataloging. The demonstration was showcased at UN Headquarters in New York, New York.
In a humanitarian context, NUDL proved indispensable during a catastrophic earthquake in Turkey. With its geospatial imagery support, data cataloging, and operational capabilities, NUDL was swiftly deployed to aid humanitarian efforts within 12 hours of the request.
NUDL's ability to transition easily from an enterprise application to a humanitarian tool opens up use cases that were not previously possible.
Future
NUDL's long-term objective is to foster collaboration through an AI/ML workbench, allowing users to share work and receive feedback from peers in a sandbox environment.
Immediate focus areas include:
- Implementing AI/ML Data Preprocessing Tools, encompassing data labeling, ontology management, automated pipelines, and data exchange.
- Enabling Model Movement/Collaboration from Low to High Side, with mechanisms for secure model transport and exchange.
- Exploring Bring-Your-Own Model Frameworks, offering prebuilt environments and experiment tracking.
- Examining integrations for data and model refinement and reusability.
- Increasing scale of imagery by volume and vendor type.
- Expanding the breadth of service-consuming applications, with metering to align resources with need.
- Integrating with Esri's ArcGIS Enterprise for end-to-end geospatial workflows.
Conclusion
The NGA Unclassified Data Lake, NUDL, establishes a standardized IT approach for collaboration and innovation between industry, academia, and NGA. It provides a robust set of capabilities for AI/ML, geospatial, and data science applications, drawing from diverse sources of commercial imagery. NUDL is a testament to the commitment to fostering collaboration and innovation within the GEOINT community.