Building a Multi-Layer Perceptron for FEM Predictions - AI Portfolio Project


Category : AI



This is a portfolio project where I built my first neural network from scratch - a Multi-Layer Perceptron (MLP) that predicts engineering simulation results. Here’s what I learned and the steps I took to build it.

Project Overview

Goal: Build a Multi-Layer Perceptron (MLP) neural network that predicts fracture mechanics results for cracked structures—specifically, whether a crack will grow and cause failure.

The Engineering Problem: Traditional Finite Element Method (FEM) simulations can take hours to days to analyze crack behavior in structures. This project creates a neural network surrogate model that predicts the same results in milliseconds.

What is FEM? The Lego Analogy

Imagine you want to understand how a rubber ball squishes when you press on it. The physics of a curved, squishy object is incredibly complex. But here’s the trick:

  1. Divide the ball into tiny pieces (like Lego blocks)
  2. Apply simple physics to each piece
  3. Connect all the pieces and see what happens

This is FEM in a nutshell—break complex objects into simple pieces (finite elements), solve each piece, and combine the results.

The Trade-off: More Accuracy = More Time

| Number of Elements | Accuracy  | Time          |
|--------------------|-----------|---------------|
| 100                | Rough     | Seconds       |
| 10,000             | Good      | Minutes       |
| 1,000,000          | Excellent | Hours         |
| 100,000,000        | Perfect   | Days to Weeks |

For complex designs—airplane wings, car crash simulations, cracked structures—you need millions of elements. And that means waiting. A lot.

The AI Solution: Learn Once, Predict Forever

Instead of running expensive FEM simulations every time, what if AI could learn from thousands of previous simulations and predict new results instantly?

Think of it like a chef:

  • Traditional FEM = Making every dish from scratch (slow but accurate)
  • AI Surrogate = A chef who’s made 10,000 similar dishes and knows the patterns (fast and good enough)

Physics-Informed Training Data: For this learning project, I didn’t have access to real FEM simulation data or expensive commercial software. Instead, I used the Tada-Paris-Irwin formula—a well-established analytical equation from fracture mechanics that calculates stress intensity factors based on crack geometry and loading conditions.

By generating synthetic training data from this physics-based formula, I could:

  • Create 5,000+ training examples instantly (no need for real FEM data)
  • Learn neural network fundamentals without expensive software licenses
  • Ensure the neural network learns physically correct relationships
  • Focus on the ML implementation rather than data collection

This approach borrows the core idea behind Physics-Informed Neural Networks (PINNs): known physics guides what the network learns. (Strictly speaking, PINNs embed the physics in the loss function, whereas here the physics enters through the synthetic training data.) While real-world applications would use actual FEM simulation data, this physics-based synthetic data was perfect for learning how to build and train neural networks for engineering problems.

Why this project?

  • Learn neural network fundamentals through hands-on implementation
  • Explore physics-informed machine learning (PINN concepts)
  • Apply ML to a real engineering problem (not just toy datasets)
  • Build a complete ML application (model + API + UI + deployment)
  • Create a portfolio piece demonstrating end-to-end ML skills

Tech Stack: Python, PyTorch, FastAPI, React + TypeScript, Docker

Repository: github.com/anachary/mvp-fem-surrogate-engine

Understanding the Problem

The neural network needs to learn this relationship:

Inputs (5 features):

  • W: Plate width (mm)
  • H: Plate height (mm)
  • a: Crack length (mm)
  • σ: Applied stress (MPa)
  • K_IC: Fracture toughness (MPa√mm)

Outputs (3 predictions):

  • K_I: Mode-I Stress Intensity Factor (measures crack severity)
  • K_II: Mode-II Stress Intensity Factor (shear mode)
  • Safety Factor: Is the structure safe? (K_IC / K_I)
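A quick worked example using numbers that appear later in the training data: for K_IC = 1500 MPa√mm and K_I ≈ 523 MPa√mm, the safety factor is 1500 / 523 ≈ 2.87, meaning the applied crack-driving force is still well below the material's resistance.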

Step 1: Understanding Multi-Layer Perceptron (MLP)

An MLP is one of the simplest types of neural network. It’s called “multi-layer” because it has:

  • Input layer: Takes your data (5 features in our case)
  • Hidden layers: Process the data through mathematical transformations
  • Output layer: Produces predictions (3 values in our case)

Here’s my MLP architecture:

Input Layer (5)          Hidden Layers              Output Layer (3)
    │                                                      │
    W ─────┐                                               ├── K_I
    H ─────┤                                               │
    σ ─────┼──→ [64 neurons] → [64] → [64] ────────────→  ├── K_II
  K_IC ────┤                                               │
    α ─────┘                                               └── Safety Factor

Why this architecture?

  • 3 hidden layers with 64 neurons each: Simple enough to train quickly, complex enough to learn patterns
  • Total parameters: ~9,000 (very lightweight!)
  • Activation function: Tanh (helps with smooth gradients)
  • No dropout or fancy tricks: Keep it simple for learning

Step 2: Building the MLP in PyTorch

Here’s the actual code I wrote for the neural network:

import torch
import torch.nn as nn

class FEMSurrogate(nn.Module):
    def __init__(self, input_dim=5, output_dim=3, hidden_dims=[64, 64, 64]):
        super().__init__()

        # Build the layers
        layers = []
        prev_dim = input_dim

        # Hidden layers
        for hidden_dim in hidden_dims:
            layers.append(nn.Linear(prev_dim, hidden_dim))
            layers.append(nn.Tanh())  # Activation function
            prev_dim = hidden_dim

        # Output layer
        layers.append(nn.Linear(prev_dim, output_dim))

        self.network = nn.Sequential(*layers)

    def forward(self, x):
        return self.network(x)

What’s happening here?

  1. nn.Linear: Creates a fully connected layer (each neuron connects to all neurons in the next layer)
  2. nn.Tanh(): Activation function that adds non-linearity (helps learn complex patterns)
  3. nn.Sequential: Chains all layers together
  4. forward(): Defines how data flows through the network
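As a quick sanity check on the "~9,000 parameters" figure from the architecture section (this snippet is my own addition, not part of the project scripts), you can count the weights and biases directly:

# Instantiate the model and count trainable parameters
model = FEMSurrogate(input_dim=5, output_dim=3, hidden_dims=[64, 64, 64])
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable parameters: {n_params}")
# (5*64 + 64) + 2*(64*64 + 64) + (64*3 + 3) = 8,899 ≈ 9,000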

Step 3: Preparing the Training Data

Neural networks learn from examples. I needed data showing the relationship between inputs and outputs.

Option 1: Generate Synthetic Data (What I Started With)

I used the Tada-Paris-Irwin formula - a mathematical formula from fracture mechanics:

\[K_I = \sigma \sqrt{\pi a} \cdot F(\alpha)\]

Where $\alpha = a/W$ (crack ratio) and:

\[F(\alpha) = 1.12 - 0.231\alpha + 10.55\alpha^2 - 21.72\alpha^3 + 30.39\alpha^4\]

This let me generate 5,000 training examples instantly:

python scripts/generate_data.py --synthetic -n 5000
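Under the hood, the generation boils down to something like the sketch below. This is my own simplified illustration of what scripts/generate_data.py does; the sampling ranges and geometry choices are illustrative, not taken from the repository.

import numpy as np

def stress_intensity_factor(sigma, a, W):
    """K_I from the Tada-Paris-Irwin formula above (alpha = a/W)."""
    alpha = a / W
    F = 1.12 - 0.231*alpha + 10.55*alpha**2 - 21.72*alpha**3 + 30.39*alpha**4
    return sigma * np.sqrt(np.pi * a) * F

rng = np.random.default_rng(0)
n = 5000
W     = rng.uniform(50, 100, n)      # plate width (mm)
H     = 2 * W                        # plate height (mm), illustrative choice
alpha = rng.uniform(0.1, 0.5, n)     # crack ratio a/W
a     = alpha * W                    # crack length (mm)
sigma = rng.uniform(50, 300, n)      # applied stress (MPa)
K_IC  = rng.uniform(1000, 2000, n)   # fracture toughness (MPa*sqrt(mm))

K_I  = stress_intensity_factor(sigma, a, W)
K_II = np.zeros(n)                   # pure Mode-I loading in this setup
SF   = K_IC / K_I                    # safety factor, as defined above

np.savez('data/fem_synthetic.npz', W=W, H=H, sigma=sigma, K_IC=K_IC,
         alpha=alpha, K_I_fem=K_I, K_II_fem=K_II, SF_fem=SF)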

Why synthetic data?

  • Fast to generate (no waiting for simulations)
  • Perfect for learning and testing
  • Can create as many examples as needed

Option 2: Real FEM Data (For Better Accuracy)

Later, I integrated a real FEM solver (FEniCSx) to generate more realistic data:

python scripts/generate_data.py --fem -n 100

The data is saved in a simple format:

# Data structure
{
    'W': [50, 60, 70, ...],           # Plate widths
    'H': [100, 120, 140, ...],        # Plate heights
    'sigma': [100, 150, 200, ...],    # Applied stress
    'K_IC': [1500, 1600, ...],        # Fracture toughness
    'alpha': [0.2, 0.3, ...],         # Crack ratios
    'K_I_fem': [523, 645, ...],       # Target outputs
    'K_II_fem': [0, 0, ...],          # Target outputs
    'SF_fem': [2.87, 2.48, ...]       # Target outputs
}
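The training loop in the next step expects two tensors, train_inputs and train_targets. Here is a minimal sketch of how they could be built from this saved file (the feature ordering is my assumption, following the dictionary above; the real script may differ):

import numpy as np
import torch

data = np.load('data/fem_synthetic.npz')

# Stack the 5 input features into an (N, 5) array and the 3 targets into (N, 3)
inputs  = np.column_stack([data['W'], data['H'], data['sigma'],
                           data['K_IC'], data['alpha']])
targets = np.column_stack([data['K_I_fem'], data['K_II_fem'], data['SF_fem']])

train_inputs  = torch.tensor(inputs,  dtype=torch.float32)
train_targets = torch.tensor(targets, dtype=torch.float32)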

Step 4: Training the Neural Network

This is where the magic happens! The network learns by:

  1. Making predictions
  2. Comparing predictions to actual values
  3. Adjusting weights to reduce errors
  4. Repeating thousands of times

Here’s my training code:

import torch.optim as optim

# Initialize the model
model = FEMSurrogate(input_dim=5, output_dim=3, hidden_dims=[64, 64, 64])

# Loss function: Mean Squared Error
criterion = nn.MSELoss()

# Optimizer: Adam (adjusts weights during training)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(500):
    # Forward pass: make predictions
    predictions = model(train_inputs)

    # Calculate loss: how wrong are we?
    loss = criterion(predictions, train_targets)

    # Backward pass: reset old gradients, then compute new ones
    optimizer.zero_grad()
    loss.backward()

    # Update weights
    optimizer.step()

    if epoch % 50 == 0:
        print(f"Epoch {epoch}, Loss: {loss.item():.4f}")

Key concepts I learned:

  • Loss function (MSE): Measures how wrong the predictions are
  • Optimizer (Adam): Smart algorithm that adjusts weights
  • Learning rate (0.001): How big the weight adjustments are
  • Epochs (500): Number of times to go through all training data
  • Backpropagation: How the network learns (calculates gradients)

Running the Training

I made it simple with a command-line script:

# Train the model
python scripts/train_surrogate.py --train data/fem_synthetic.npz --epochs 500

Training results:

  • Time: ~3-5 minutes on CPU
  • Final loss: ~0.001 (very low = good predictions!)
  • Model saved to: checkpoints/surrogate_best.pt

The script automatically handles several housekeeping steps (sketched after this list):

  • Splits data into training (80%) and validation (20%)
  • Normalizes inputs/outputs for stable training
  • Saves the best model based on validation loss
  • Stops early if loss stops improving
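Roughly, those housekeeping steps look like this. It is a simplified sketch of what scripts/train_surrogate.py does, continuing from the training code above (model, criterion, optimizer) and the tensors from Step 3; the patience value and other details are my own illustration.

# 80/20 split of the dataset
n_total = len(train_inputs)
perm = torch.randperm(n_total)
n_train = int(0.8 * n_total)
train_idx, val_idx = perm[:n_train], perm[n_train:]

# Normalize inputs and outputs with training-set statistics (stabilizes training);
# clamp std to avoid dividing by zero for constant columns such as K_II
x_mean = train_inputs[train_idx].mean(0)
x_std  = train_inputs[train_idx].std(0).clamp(min=1e-8)
y_mean = train_targets[train_idx].mean(0)
y_std  = train_targets[train_idx].std(0).clamp(min=1e-8)
x_train = (train_inputs[train_idx]  - x_mean) / x_std
y_train = (train_targets[train_idx] - y_mean) / y_std
x_val   = (train_inputs[val_idx]    - x_mean) / x_std
y_val   = (train_targets[val_idx]   - y_mean) / y_std

# Train, track validation loss, keep the best checkpoint, stop early if no progress
best_val_loss, patience, bad_epochs = float('inf'), 50, 0
for epoch in range(500):
    loss = criterion(model(x_train), y_train)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    with torch.no_grad():
        val_loss = criterion(model(x_val), y_val).item()
    if val_loss < best_val_loss:
        best_val_loss, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), 'checkpoints/surrogate_best.pt')
    else:
        bad_epochs += 1
        if bad_epochs >= patience:
            break  # early stopping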

Step 5: Making Predictions (Inference)

Once trained, using the model is simple:

# Load the trained model
model = FEMSurrogate()
model.load_state_dict(torch.load('checkpoints/surrogate_best.pt'))
model.eval()  # Set to evaluation mode

# Make a prediction (no gradients needed at inference time)
input_data = torch.tensor([[50.0, 100.0, 10.0, 100.0, 1500.0]])  # W, H, a, σ, K_IC
with torch.no_grad():
    prediction = model(input_data)

# Results
K_I = prediction[0][0].item()      # 523.45
K_II = prediction[0][1].item()     # 0.0
safety_factor = prediction[0][2].item()  # 2.87

Prediction speed: ~0.1 milliseconds on CPU!

Step 6: Building the REST API

To make the model accessible, I built a FastAPI backend:

import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the trained model once at startup
# (FEMSurrogate is the MLP from Step 2, defined in src/core/surrogate.py)
model = FEMSurrogate()
model.load_state_dict(torch.load('checkpoints/surrogate_best.pt'))
model.eval()

class PredictionRequest(BaseModel):
    W: float
    H: float
    a: float
    sigma: float
    K_IC: float

@app.post("/predict-crack-sif")
def predict(request: PredictionRequest):
    # Prepare input
    input_tensor = torch.tensor([[
        request.W, request.H, request.a,
        request.sigma, request.K_IC
    ]])

    # Make prediction
    with torch.no_grad():
        output = model(input_tensor)

    return {
        "K_I": output[0][0].item(),
        "K_II": output[0][1].item(),
        "safety_factor": output[0][2].item()
    }

Run the API:

uvicorn src.api.app:app --reload

Test it:

curl -X POST http://localhost:8000/predict-crack-sif \
  -H "Content-Type: application/json" \
  -d '{"W": 50, "H": 100, "a": 10, "sigma": 100, "K_IC": 1500}'

Step 7: Building the Frontend

I created a React TypeScript UI for easy interaction:

  • Input sliders for all parameters
  • Real-time predictions as you adjust values
  • Color-coded safety warnings (green/yellow/red)
  • Visualization of results

Step 8: Docker Deployment

Everything packaged in Docker for easy deployment:

# Start the full stack (API + UI)
docker-compose up --build -d

# Access:
# - UI:  http://localhost:3000
# - API: http://localhost:8000

Project Results

Performance:

  • Training time: ~3-5 minutes on CPU
  • Prediction time: ~0.1 milliseconds
  • Model size: ~36 KB
  • Accuracy: < 5% error on validation data

What I achieved:

  • ✅ Built my first neural network from scratch
  • ✅ Learned PyTorch fundamentals
  • ✅ Understood backpropagation and gradient descent
  • ✅ Created a full-stack ML application
  • ✅ Deployed with Docker

Key Learnings

1. MLPs are Simple but Powerful

You don’t need complex architectures for many problems. A simple MLP with three hidden layers of 64 neurons each can learn complex patterns effectively.

2. Data Preparation is Critical

  • Normalize inputs/outputs (makes training stable)
  • Split into train/validation sets (prevents overfitting)
  • Start with synthetic data (fast iteration)
  • Validate with real data (ensures accuracy)

3. Training is Iterative

My first model didn’t work well. I learned to:

  • Adjust learning rate (too high = unstable, too low = slow)
  • Monitor validation loss (detect overfitting)
  • Use early stopping (save best model)
  • Experiment with architecture (layers, neurons, activation functions)

4. Deployment Makes it Real

Building the API and UI transformed this from a learning exercise into a usable tool. This taught me:

  • FastAPI for serving ML models
  • React for building interfaces
  • Docker for packaging everything
  • REST API design

What’s Next?

I’m planning to extend this project:

  • Try different architectures (deeper networks, skip connections)
  • Implement uncertainty quantification
  • Add more complex crack geometries
  • Integrate with real FEM software
  • Optimize for mobile deployment

How to Run This Project

The complete code is on GitHub: github.com/anachary/mvp-fem-surrogate-engine

Quick Start (3 steps):

# 1. Generate training data
python scripts/generate_data.py --synthetic -n 5000

# 2. Train the MLP
python scripts/train_surrogate.py --train data/fem_synthetic.npz --epochs 500

# 3. Run the API
uvicorn src.api.app:app --reload

Or use Docker:

docker-compose up --build -d
# UI at http://localhost:3000
# API at http://localhost:8000

Project Structure

mvp-fem-surrogate-engine/
├── src/
│   ├── core/
│   │   ├── surrogate.py          # MLP model definition
│   │   ├── simple_trainer.py     # Training loop
│   │   └── physics_losses.py     # Tada-Paris-Irwin formula
│   └── api/
│       └── app.py                # FastAPI server
├── scripts/
│   ├── generate_data.py          # Data generation
│   └── train_surrogate.py        # Training script
├── ui/                           # React frontend
└── checkpoints/                  # Saved models

Technologies Used

  • PyTorch: Neural network framework
  • NumPy: Data manipulation
  • FastAPI: REST API framework
  • React + TypeScript: Frontend
  • Docker: Containerization

Conclusion

This project taught me the fundamentals of neural networks through hands-on implementation:

Core ML Concepts:

  • Multi-Layer Perceptron architecture
  • Forward and backward propagation
  • Loss functions and optimization
  • Training, validation, and testing
  • Model deployment

Practical Skills:

  • PyTorch for building neural networks
  • Data preparation and normalization
  • Hyperparameter tuning
  • Building ML APIs with FastAPI
  • Full-stack ML application development

Key Takeaway: You don’t need complex models or massive datasets to build useful ML applications. A simple MLP with good data and proper training can solve real-world problems effectively.


Want to learn more about my AI projects? Check out my other posts on neural networks and machine learning.

Project Repository: github.com/anachary/mvp-fem-surrogate-engine


References and Research Papers

This project was inspired by cutting-edge research in physics-informed machine learning and surrogate modeling:

Physics-Informed Neural Networks (PINNs)

  1. Raissi, M., Perdikaris, P., & Karniadakis, G. E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686-707. DOI: 10.1016/j.jcp.2018.10.045. Key contribution: Introduced PINNs that incorporate physics laws directly into neural network training.

  2. Karniadakis, G. E., Kevrekidis, I. G., Lu, L., Perdikaris, P., Wang, S., & Yang, L. (2021). Physics-informed machine learning. Nature Reviews Physics, 3(6), 422-440. DOI: 10.1038/s42254-021-00314-5. Key contribution: Comprehensive review of physics-informed ML approaches.

Neural Operators and Surrogate Models

  1. Lu, L., Jin, P., Pang, G., Zhang, Z., & Karniadakis, G. E. (2021). Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3), 218-229. DOI: 10.1038/s42256-021-00302-5. Key contribution: DeepONet architecture for learning operators (function-to-function mappings).

  2. Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A., & Anandkumar, A. (2020). Fourier Neural Operator for Parametric Partial Differential Equations. arXiv preprint arXiv:2010.08895. Key contribution: Fourier Neural Operator (FNO) for fast PDE solving.

Fracture Mechanics and Engineering Applications

  1. Tada, H., Paris, P. C., & Irwin, G. R. (2000). The Stress Analysis of Cracks Handbook (3rd ed.). ASME Press. Key contribution: Standard reference for stress intensity factor formulas (the Tada-Paris-Irwin formula used in this project).

  2. Goswami, S., Anitescu, C., Chakraborty, S., & Rabczuk, T. (2020). Transfer learning enhanced physics informed neural network for phase-field modeling of fracture. Theoretical and Applied Fracture Mechanics, 106, 102447. DOI: 10.1016/j.tafmec.2019.102447. Key contribution: Application of PINNs to fracture mechanics problems.

Open-Source Tools and Frameworks

  1. Lu, L., Meng, X., Mao, Z., & Karniadakis, G. E. (2021). DeepXDE: A deep learning library for solving differential equations. SIAM Review, 63(1), 208-228. DOI: 10.1137/19M1274067. Tool: DeepXDE (GitHub) - Library for physics-informed neural networks.

  2. Takamoto, M., Praditia, T., Leiteritz, R., MacKinlay, D., Alesiani, F., Pflüger, D., & Niepert, M. (2022). PDEBench: An Extensive Benchmark for Scientific Machine Learning. arXiv preprint arXiv:2210.07182. Tool: PDEBench (GitHub) - Benchmark datasets for PDE learning.


Note: This project implements a simplified MLP-based surrogate model with physics-informed training data generation. For more advanced applications, consider exploring the full PINN and neural operator frameworks referenced above.


About Akash Acharya

Azure Solution Architect and Full Stack Web Developer, based in Livonia, Michigan, USA.

Email: akashnacharya@gmail.com

Hi, this is Akash Acharya. I built this blog to jot down the approaches I took to learn new topics in computer science.
