• This curriculum introduces students in grades 9 – 14 to core Artificial Intelligence concepts through real-world applications in Environmental Science. The course combines foundational AI principles with hands-on Python coding labs.

    Students will:

    • Understand essential AI concepts and apply them to real-world challenges
    • Build confidence through practical, project-based learning
    • Learn responsible AI practices
    • Explore how AI can create positive social impact

    Using climate change as the central theme, students will identify real-world problems, select appropriate AI methods, and evaluate model performance using data and machine learning metrics.

    Prerequisite: Basic Python programming
    Target Audience: Grades 9 – 14 students

    Session #1 Introduction to AI in Climate Science

    [40 minutes Instruction] Applications of AI in climate science

    [10 minutes] Understanding Climate Change and its Impacts
    Climate change is one of the biggest challenges facing human society today. Increasingly extreme climate hazards such as heatwaves, floods, and wildfires cost lives and damage property and public infrastructure. Watch this expert-led video from Oxford University’s Department of Physics for an overview of climate impacts and how AI can help address them.

    [10 minutes] The Data-to-Decision Value Chain in Climate Science
    Climate change is a complex, multi-layered challenge requiring observation systems (sensors and satellites), large-scale data storage, advanced analytics, predictive models, policy frameworks, and coordinated action. To structure this process, the IPCC describes a data-to-decision value chain. We begin by understanding this chain, then explore how AI supports each stage.

    Stage | Description | Example
    Observation Layer | Collect raw Earth system data | Satellites, ground sensors
    Modeling / Analytics Layer | Convert data into interpretable indicators | CO₂ emissions, temperature, rainfall
    Impact & Risk Assessment Layer | Estimate hazards and probabilities | Flood, heatwave, drought risk
    Decision & Policy Layer | Translate insights into action | Emission controls, adaptation planning
    Outcome Layer | Real-world societal and environmental benefits | Reduced emissions, improved resilience
    Analytical Reports | Synthesize evidence for planning and governance | IPCC reports, NDCs, NAPs, IAM scenarios

    [20 minutes] AI Applications in Climate Science
    AI can help with monitoring, with what-if analysis to suggest actions, and with synthesizing complex information so that experts in different domains can communicate. Here are applications currently available online:

    • Climate TRACE uses satellite data and AI to track greenhouse gas emissions from hundreds of thousands of sources worldwide, from power plants to farms, broken down by country, city, or individual facility. It also helps experts estimate how much emissions could be cut by taking specific actions.
    • Carbon Mapper uses satellites and aircraft to detect and pinpoint facilities leaking unusually large amounts of methane or CO₂. Its public data portal helps governments, regulators, and companies identify where the biggest fixes are needed and take direct action.
    • Google’s Heat Resilience tool helps city planners test different cooling strategies, such as planting trees or installing cool roofs, before committing resources, by showing where each approach would have the greatest impact on temperature and public health.
    • ChatClimate lets anyone ask questions about climate risk in plain language, with answers grounded in the latest IPCC reports. It helps bridge the gap between climate scientists and professionals in fields like finance, urban planning, or policy who need to act on that knowledge.

    Session #2 Foundations of AI and Machine Learning

    [40 minutes Instruction] Essential AI and machine learning concepts

    [7 minutes] Perception of AI Systems
    In climate science, computers “sense” the world through satellites and ground-based sensors, collecting imagery and environmental measurements (e.g., temperature, gases, land surface data).

    [8 minutes] Representation
    AI converts raw data into numerical forms such as features and embeddings. Features describe specific properties (e.g., vegetation index), while embeddings act like compact summaries capturing overall patterns and similarities.
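    To make the feature idea concrete, here is a minimal sketch of one classic hand-crafted feature, NDVI (normalized difference vegetation index), computed from two reflectance values. The specific reflectance numbers are illustrative, not real sensor readings.

```python
# A feature describes one specific property of a location. NDVI is
# computed from red and near-infrared (NIR) reflectance: vegetation
# reflects NIR strongly, so values near 1 indicate dense vegetation
# while values near 0 indicate bare or built-up surfaces.
def ndvi(red, nir):
    return (nir - red) / (nir + red)

print(ndvi(red=0.10, nir=0.50))   # dense vegetation: ~0.67
print(ndvi(red=0.30, nir=0.35))   # sparse vegetation: ~0.08
```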

    [15 minutes] Learning
    Students are introduced to supervised learning (using labeled examples), unsupervised learning (discovering patterns without labels), and reinforcement learning (learning through feedback and rewards), with examples tied to environmental analysis.

    [5 minutes] Natural Language Interaction
    Large Language Models (LLMs) enable interaction with AI systems using everyday language, supporting tasks like querying datasets, explaining results, or exploring scientific reports.

    [10 minutes] Responsible AI Considerations
    AI predictions are probabilistic and subject to uncertainty. Key concepts include precision, recall, bias, and explainability.

    Session #3 Land Cover Discovery from Satellite Imagery

    [60 minutes Lab] Exploring perception, representation, learning, and evaluation metrics

    Understanding crop type distribution is important for climate research, food security analysis, water resource management, and sustainability planning. Because detailed crop maps like the USDA NASS Cropland Data Layer (CDL) are only available for the United States, this session highlights how AI can help bridge geographic data gaps. By using embeddings and the K-Nearest Neighbor (KNN) algorithm, students see how machine learning models trained on U.S. data can generalize to other regions.

    [5 minutes] Explain: Feature Representation (ESA WorldCover Example)
    Explain feature-based representation using ESA WorldCover, where each pixel encodes land cover classes (e.g., built-up, forest, water, cropland).

    [7 minutes] Demo: Visualizing ESA WorldCover in Colab
    Display land cover categories for Ann Arbor as a colored layer on the map and interpret the class labels.

    [5 minutes] Demo: Explore U.S. Cropland Data Layer (CDL)
    Examine how the USDA NASS CDL dataset provides detailed, pixel-level crop type classifications across the United States, including crops such as corn, soybeans, wheat, and cotton.

    [5 minutes] Explain: Embedding Representation (Alpha Earth Embeddings)
    Introduce embeddings as compact numerical vectors that summarize the semantic characteristics of imagery. Explain that instead of using raw pixels, AI models encode each image or location into a list of numbers capturing patterns such as texture, vegetation, and land structure. Similar landscapes produce similar embeddings, enabling comparison, clustering, and machine learning tasks like classification.
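    A toy sketch of the key property above: similar landscapes produce similar embeddings. The 4-dimensional vectors and their values are purely illustrative (real systems such as Alpha Earth use much longer vectors); cosine similarity is one standard way to compare them.

```python
import numpy as np

# Illustrative toy embeddings for three locations.
cornfield_iowa    = np.array([0.9, 0.2, 0.1, 0.4])
cornfield_ontario = np.array([0.8, 0.3, 0.1, 0.5])
downtown_chicago  = np.array([0.1, 0.9, 0.8, 0.2])

def cosine_similarity(a, b):
    # 1.0 means identical direction; values near 0 mean unrelated.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_similarity(cornfield_iowa, cornfield_ontario))  # high
print(cosine_similarity(cornfield_iowa, downtown_chicago))   # low
```

    The two cornfields score close to 1.0 even though they are on different continents, which is exactly what makes embeddings useful for comparison, clustering, and classification.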

    [5 minutes] Demo: Extracting Embeddings by Coordinates
    Show how to retrieve Alpha Earth embeddings for selected geographic locations in Colab.

    [10 minutes] Explain: Extending Beyond the U.S. with Machine Learning (KNN)
    Since detailed crop type labels (CDL) are only available for the U.S., we use Alpha Earth embeddings and the KNN algorithm to generalize and predict crop types in regions outside the U.S. Explain how KNN classifies a location by comparing its embedding to the most similar labeled examples and assigning the majority crop type among its nearest neighbors.
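    The majority-vote logic of KNN can be sketched in a few lines of NumPy. The 2-dimensional "embeddings" and labels below are toy stand-ins for real Alpha Earth embeddings with CDL labels.

```python
import numpy as np

def knn_predict(query, train_X, train_y, k=3):
    """Classify `query` by majority vote among its k nearest
    labeled embeddings, using Euclidean distance."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest
    values, counts = np.unique(train_y[nearest], return_counts=True)
    return values[np.argmax(counts)]         # majority crop type

# Toy labeled embeddings (e.g., U.S. pixels with CDL labels):
train_X = np.array([[0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
                    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85]])
train_y = np.array(["corn", "corn", "corn",
                    "not_corn", "not_corn", "not_corn"])

# A location outside the U.S. whose embedding resembles the corn examples:
print(knn_predict(np.array([0.82, 0.18]), train_X, train_y))  # "corn"
```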

    [8 minutes] Demo: KNN Binary Classification (Corn vs. Not Corn)
    Train the KNN model in Colab using USDA NASS CDL data from Ann Arbor, MI, then test and visualize predictions for South Bend, IN.

    [10 minutes] Explain and Demo: Model Evaluation Metrics
    Explain and demo accuracy, precision, and recall in the context of crop detection.
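    The three metrics reduce to simple counts of true/false positives and negatives. A minimal worked example on ten made-up pixel predictions:

```python
# True labels and model predictions for ten pixels (1 = corn, 0 = not corn).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # 3
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # 1
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # 1
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)  # 5

accuracy  = (tp + tn) / len(y_true)  # 0.8: fraction of all pixels correct
precision = tp / (tp + fp)           # 0.75: of pixels called corn, share truly corn
recall    = tp / (tp + fn)           # 0.75: of truly corn pixels, share found
print(accuracy, precision, recall)
```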

    [5 minutes] Demo: Inferencing Outside the U.S.
    Apply the trained KNN model to Southern Ontario, Canada, demonstrating how the model extends detailed crop classification beyond the U.S.

    Homework: Build a prediction model to identify a different crop type

    Session #4 Interacting with Satellite Imagery Using Generative AI

    [40 minutes Lab] Natural language interaction & Responsible AI considerations

    In this session, students will explore how to design interactive user experiences in Colab and how Generative AI can enable more natural human-computer interaction.

    [10 minutes] Demo: Interactive User Experience in Colab
    Create coordinate input widgets (float values) in Colab. Upon clicking the display button, visualize land cover categories from ESA WorldCover dataset as a colored map layer for the selected location.

    [10 minutes] Demo: Visualizing Land Cover Distribution
    Display a bar chart summarizing land cover categories by area (km²), such as tree cover, grassland, and cropland, to improve interpretability.
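    The aggregation behind such a bar chart is a per-class pixel count scaled by pixel area. A minimal sketch on a tiny made-up label grid (the class codes follow ESA WorldCover conventions; the grid values and per-pixel area are assumptions for illustration):

```python
import numpy as np

# A tiny 4x4 grid of ESA WorldCover class codes (10 = tree cover,
# 30 = grassland, 40 = cropland); real rasters are far larger.
grid = np.array([[10, 10, 30, 40],
                 [10, 30, 30, 40],
                 [10, 10, 40, 40],
                 [30, 30, 40, 40]])

names = {10: "Tree cover", 30: "Grassland", 40: "Cropland"}
pixel_area_km2 = 0.01  # assumed area per pixel, for illustration only

codes, counts = np.unique(grid, return_counts=True)
areas = {names[c]: n * pixel_area_km2 for c, n in zip(codes, counts)}
print(areas)  # these totals are what the bar chart displays
```

    Plotting is then a single call, e.g. matplotlib's `plt.bar(areas.keys(), areas.values())`.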

    [10 minutes] Demo and Explain: Natural Language Interaction for Land Cover Queries
    Explain that entering latitude and longitude coordinates can be difficult for users. Instead, demonstrate querying land cover using a region name (e.g., city, county, or ZIP code). Show how an LLM API with function calling converts the region name into structured geographic coordinates (longitude and latitude defining a bounding box), ensuring consistent outputs for visualization and analysis.
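    The shape of such a tool is sketched below. The schema mirrors the JSON-schema style used by LLM function-calling APIs, but the function name, fields, and the tiny lookup table are hypothetical stand-ins; in the real demo the LLM resolves the region name and returns these structured values.

```python
# Hypothetical tool schema the LLM can call (names are illustrative,
# not a specific vendor API).
get_bounding_box_schema = {
    "name": "get_bounding_box",
    "description": "Resolve a region name to a geographic bounding box.",
    "parameters": {
        "type": "object",
        "properties": {
            "region": {"type": "string",
                       "description": "City, county, or ZIP code"},
        },
        "required": ["region"],
    },
}

# Stub resolver standing in for the LLM + geocoding round trip;
# coordinates are approximate and for illustration only.
KNOWN_REGIONS = {
    "Ann Arbor, MI": {"min_lon": -83.80, "min_lat": 42.22,
                      "max_lon": -83.67, "max_lat": 42.32},
}

def get_bounding_box(region):
    # Structured output: always the same four keys, so downstream
    # mapping and analysis code can rely on the shape of the result.
    return KNOWN_REGIONS[region]

print(get_bounding_box("Ann Arbor, MI"))
```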

    [10 minutes] Responsible AI Considerations
    Discuss that LLMs can produce hallucinations (confident but incorrect outputs). While benchmarks (e.g., question answering or reasoning tests) provide snapshots of model performance, they are limited to specific datasets and do not guarantee accuracy in real-world use. Users may issue queries that differ significantly from benchmark scenarios.

    Key principles for responsible AI design:

    • Constrain outputs & keep designs purposeful: In this use case, the LLM is only used to interpret region names, with outputs restricted to structured geographic coordinates. Use LLMs only where they add clear value, keeping the system simple and useful.
    • Transparency: Display maps and coordinates so users can verify results, understand model behavior, and detect potential errors.
    • Logging & reproducibility: Record system inputs and outputs to enable debugging, auditing, and analysis of failure cases.
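    The logging principle can be sketched with the standard library alone: append one structured JSON record per interaction so failures can be replayed and audited. The field names are illustrative; `StringIO` stands in for a real log file.

```python
import json, datetime, io

def log_interaction(log_file, user_query, tool_args, result):
    """Append one structured record per interaction."""
    record = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "user_query": user_query,
        "tool_args": tool_args,
        "result": result,
    }
    # One JSON object per line (JSONL) keeps the log easy to parse later.
    log_file.write(json.dumps(record) + "\n")

log = io.StringIO()  # in practice, an append-mode file on disk
log_interaction(log, "Show land cover for Ann Arbor",
                {"region": "Ann Arbor, MI"}, {"status": "ok"})
print(log.getvalue())
```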

    Final Project

    Natural language interaction with the crop type prediction model

  • The Lost City of Z is the name given by Colonel Percy Harrison Fawcett, a British surveyor of the early 20th century, to an indigenous city that he believed had existed in the jungle of the Mato Grosso state of Brazil. Based on early histories of South America and his own explorations of the Amazon River region, Fawcett theorized that a complex civilization had once existed there, and that isolated ruins may have survived. Fawcett and two companions disappeared during an expedition to find evidence of the hypothesized civilization in 1925.

    In this lab, we will leverage Multimodal LLMs to discover evidence of the lost city of Z in the Amazon Rainforest, predicting potential archaeological sites by mining satellite imagery.

    From this lab, you will learn the following concepts in AI science:

    • Large Language Models (LLMs)
    • Multimodal machine learning
    • Data Science Metrics: Precision, Recall, Accuracy
    • Few-shot learning in Generative AI

    Lab with Code

  • In this lab, you will develop an AI chatbot agent that creates personalized photobooks from a large collection of photos, guided by real-time user feedback.

    While Large Language Models (LLMs) are powerful, they face challenges with complex, multi-modal tasks like photobook creation. Most LLMs cannot process 100+ images effectively, and even with smaller sets, they often struggle to capture the context and user intent required for meaningful arrangement.

    To overcome these limitations, you’ll adopt an agentic approach—breaking the problem into manageable steps and using the LLM’s strengths within a larger system. Here, the chatbot will orchestrate tools and reasoning steps rather than working in isolation.

    By the end of this lab, you will gain hands-on experience in:

    • LLMs and Multi-modal LLMs
    • Agentic System Design
    • Structured Output Generation
    • Tool Integration with Gemini Function Calling

    Please contact haoyun.feng@arcknow.com for access to the lab.


  • In a Nutshell …

    🔥Trending in AI Science: Physical Reasoning

    AI is moving beyond pattern recognition – toward capturing the laws of physics. From modeling planetary motion to simulating stars using physics-informed neural networks, researchers are exploring how AI can reason about the physical world to power breakthroughs in science and engineering.

    💊 Diffusion Models in Drug Discovery

    Diffusion models, popularly known for high-quality image generation, are now designing novel drug candidates. By capturing the complexity of molecular structures and optimizing for safety, stability, and effectiveness, they’re accelerating the path to faster, smarter drug discovery.


    Deeper Dive

    🔥Trending in AI Science: Physical Reasoning

    What are scientific discoveries? They are abstract rules that generalize across countless observations. Newton’s law of motion applies to falling apples, orbiting moons, and flying rockets. A logistic regression model generalizes from data to predict outcomes. Large Language Models (LLMs) generalize across infinite amounts of text.

    The scientific process starts with observations: for example, the moon changes shape and position in the sky. From repeated observations, we see patterns – the moon always follows a cycle. From these patterns, scientists summarize laws: the moon orbits Earth according to Newton’s laws of motion.

    Generative models in AI can already capture complicated patterns in language, images, and video. But now researchers are asking a deeper question:

    Can AI capture not just patterns, but the underlying physical laws of our universe?

    If an AI model can generate videos of objects moving, can it also simulate the true physics – predicting how every object interacts? If yes, the applications could be profound: from accelerating spaceship design to making travel safer and cheaper – so perhaps one day, we really can vacation on the moon.

    But here’s the challenge: bigger models don’t automatically learn physics.

    Studies probing this question reach the same conclusion: foundation models mimic training data patterns but fail to discover abstract physical rules.

    The good news?

    Enter Physics-Informed AI. Physics-Informed Neural Networks (PINNs), described in papers like Differentiable Stellar Atmospheres with Physics-Informed Neural Networks and Sub-Sequential Physics-Informed Learning with State Space Model, are neural networks that directly incorporate physical laws into their training. Instead of learning solely from data, the loss function also rewards models that respect established physical principles and penalizes those that violate them.
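    The core idea, a loss that combines data fit with a penalty for violating the governing equation, can be shown with a toy example. This sketch uses the ODE du/dt = -k·u and finite differences in place of the automatic differentiation a real PINN would use; all values are illustrative.

```python
import numpy as np

# Toy physics-informed loss for the ODE du/dt = -k*u.
k = 1.0
t = np.linspace(0.0, 2.0, 50)
observations = np.exp(-k * t)  # noiseless data from the true solution

def physics_residual(u, t, k):
    # How badly a candidate solution violates du/dt = -k*u,
    # with the derivative approximated by finite differences.
    du_dt = np.gradient(u, t)
    return np.mean((du_dt + k * u) ** 2)

def pinn_loss(u, t, observations, k, weight=1.0):
    # Data-fit term plus a weighted physics-violation term.
    data_loss = np.mean((u - observations) ** 2)
    return data_loss + weight * physics_residual(u, t, k)

exact  = np.exp(-k * t)   # respects the physics
linear = 1.0 - 0.5 * t    # fits the endpoints but violates the ODE

print(pinn_loss(exact, t, observations, k))   # small: obeys du/dt = -k*u
print(pinn_loss(linear, t, observations, k))  # larger: physics penalty
```

    During PINN training, this combined loss is minimized over the network's parameters, so candidates that respect the physical law are rewarded and those that violate it are penalized.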

    Similarly, “Causal-PIK: Causality-based Physical Reasoning with a Physics-Informed Kernel” leverages dynamics models to predict causal effects grounded in physics, integrating them into Gaussian optimization via kernel functions. Put simply, it can find the best move in a game faster than other algorithms – by reasoning about what will happen next based on the laws of physics. This approach mirrors how humans intuitively anticipate outcomes in the real world.

    Applications of Physical Reasoning

    Predicting Earthquakes
    “Broadband Ground Motion Synthesis by Diffusion Model with Minimal Condition” demonstrates how conditional diffusion models – popularly known for generating high-quality images – can be applied to geoscience. In this case, they generate realistic simulations of earthquake ground motion, supporting applications such as earthquake risk assessment, early warning systems, disaster preparedness and response.

    Understanding How Stars Work
    “Differentiable Stellar Atmospheres with Physics-Informed Neural Networks” asks: Can we determine a star’s temperature, composition, gravity, velocity, and internal structure from light-years away? Astronomers study the spectra of starlight to infer these properties. This paper uses a physics-informed neural network (PINN) to model the relationship more accurately, incorporating hydrostatic equilibrium as a physical constraint during training. The result brings us closer to understanding the inner workings of stars.

    💊 Diffusion Models in Drug Discovery

    Diffusion models – originally developed for high-quality image generation – are now making waves in drug discovery.

    They excel at:

    • Capturing the complexity of molecular structures
    • Optimizing candidates for safety, stability, and effectiveness

    The result? Diffusion models can generate diverse, high-quality drug candidates faster and more efficiently than traditional trial-and-error methods – potentially speeding up the path from idea to treatment.