Inspired by SpatialLM, SpaceLM is a 3D reconstruction system that reconstructs 3D models from 2D images and videos. We will keep improving the code and the pretrained model. Stay tuned for more details and improvements.
- Release training dataset
- Release dataset preprocessing code
- Release training code
- Release the pretrained model (Apr 10)
- Add Inference and Visualize Demo (Apr 15)
We use one A100 GPU to train the model.
Please follow the steps below to create the environment and install the dependencies.
git clone https://github.com/sengine-research/SpaceLM.git
cd SpaceLM
conda create -n spacelm python=3.10
conda activate spacelm
# CUDA version 12.4
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
pip install xformers==v0.0.29.post2 --index-url https://download.pytorch.org/whl/cu124 # install xformers from table below
# CUDA version 11.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install xformers==v0.0.29.post2 --index-url https://download.pytorch.org/whl/cu118 # install xformers from table below
pip install -r requirements.txt
pip install git+https://github.com/mit-han-lab/torchsparse.git # may take a long time
# install spconv cuda 12.4
pip install spconv-cu124
# install spconv cuda 11.8
pip install spconv-cu118
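After the installs above, a quick import check can confirm the environment is consistent. The snippet below is a minimal sketch (the script name quick_env_check.py is hypothetical, not part of the repo); if any import fails, see the troubleshooting notes below.

```python
# quick_env_check.py (hypothetical helper): verify that the core dependencies
# import and that CUDA is visible.
import torch
print("torch", torch.__version__, "| CUDA runtime:", torch.version.cuda,
      "| GPU available:", torch.cuda.is_available())

import xformers          # memory-efficient attention kernels
import torchsparse       # sparse point-cloud operators
import spconv.pytorch    # sparse convolution backend
print("xformers / torchsparse / spconv imported OK")
```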
# Troubleshooting common installation problems

# 1. Installing torch_scatter: if you hit "ERROR: Failed building wheel for torch_scatter",
#    it is usually a CUDA / PyTorch version mismatch; reinstalling PyTorch from a matching wheel index fixes it:
pip install torch torchvision torchaudio -f https://mirrors.aliyun.com/pytorch-wheels
# 2. Installing torchsparse: you might see
#    "ERROR: Failed building wheel for torchsparse
#     Running setup.py clean for torchsparse
#     Failed to build torchsparse"
#    (note: this error originates from a subprocess and is likely not a problem with pip).
#    Building from source fixes it:
a: git clone https://github.com/mit-han-lab/torchsparse.git
b: cd torchsparse/
c: python setup.py install # if this fails with "RuntimeError: Error compiling objects for extension", go to step d; otherwise skip to the end
d: sudo apt-get install libsparsehash-dev # only needed if step c failed
e: python setup.py install # if you hit "[Errno 101] Network is unreachable", re-run the command a few times; the download sometimes fails transiently
# On success you will see: Finished processing dependencies for torchsparse==2.1.0
Install xformers according to the table below, e.g.:
pip install xformers==v0.0.29.post2 --index-url https://download.pytorch.org/whl/cu118
| xformers | PyTorch | CUDA |
| --- | --- | --- |
| v0.0.29.post2 | torch==2.6.0 | cu118, cu124, cu126 |
| 0.0.29.post1, 0.0.29, 0.0.28.post3 | torch==2.5.1 | cu118, cu121, cu124 |
| 0.0.28.post2 | torch==2.5.0 | cu118, cu121, cu124 |
| 0.0.28.post1 | torch==2.4.1 | cu118, cu121, cu124 |
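To see which row of the table applies to your environment, print the locally installed torch build (this only assumes torch is already installed):

```python
import torch
print(torch.__version__)  # e.g. "2.6.0+cu124" -> use xformers v0.0.29.post2 with the cu124 index
```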
python app.py
# Inference
python inference.py --model_path PRE_TRAINED_MODEL_PATH --point_cloud sample_data/scene0000_00/scene0000_00_pc_result.ply -o test.txt
# Save the result as an rrd file
python visualize.py --point_cloud sample_data/scene0000_00/scene0000_00_pc_result.ply --layout test.txt --save test.rrd
# Open the rrd file (e.g. on a Windows machine)
pip install rerun-sdk
rerun test.rrd
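If you want to run the demo over several scenes, a small wrapper can chain the two commands above. This is a sketch only; the glob pattern and the PRE_TRAINED_MODEL_PATH placeholder are assumptions.

```python
# batch_demo.py (hypothetical helper): run inference + visualization for every
# sample point cloud, reusing the exact CLI invocations shown above.
import glob
import os
import subprocess

MODEL_PATH = "PRE_TRAINED_MODEL_PATH"  # replace with your checkpoint directory

for ply in glob.glob("sample_data/*/*_pc_result.ply"):
    name = os.path.splitext(os.path.basename(ply))[0]
    subprocess.run(["python", "inference.py", "--model_path", MODEL_PATH,
                    "--point_cloud", ply, "-o", f"{name}.txt"], check=True)
    subprocess.run(["python", "visualize.py", "--point_cloud", ply,
                    "--layout", f"{name}.txt", "--save", f"{name}.rrd"], check=True)
```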
Note: a GPU with 48 GB of VRAM is recommended, and training takes about 5 hours.
modelscope download --model qwen/Qwen2.5-0.5B-Instruct --local_dir ./Qwen2.5-0.5B-Instruct
python model/modify_qwen2.5.py --model_path ./Qwen2.5-0.5B-Instruct # convert the vision tokens to point tokens (see the sketch below)
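The modification script rewires the tokenizer so the prompt can carry point-cloud embeddings instead of image embeddings. Roughly, this is the kind of change involved; the token names below are hypothetical, and the actual ones are defined in model/modify_qwen2.5.py.

```python
# Illustrative sketch only - the real logic lives in model/modify_qwen2.5.py.
from transformers import AutoTokenizer, AutoModelForCausalLM

path = "./Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path)

# Register placeholder tokens that mark where point-cloud features are spliced in
# (token names are hypothetical).
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|point_start|>", "<|point_pad|>", "<|point_end|>"]}
)
model.resize_token_embeddings(len(tokenizer))

tokenizer.save_pretrained(path)
model.save_pretrained(path)
```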
Download scenescript_model_ase.ckpt and place it in the root directory of the project.
Thanks to VLA-3D, six of the most popular open-source datasets (Scannet / Matterport / HM3D / Unity / ARKitScenes / 3RScan) are available in the same format for easy training. We use Scannet as an example to show how to train the model. For the other datasets (Matterport / HM3D / Unity / ARKitScenes / 3RScan), you can refer to the code and train on them yourself.
git lfs install
git clone https://huggingface.co/datasets/sengine-research/preprocessed-vla-3d
mkdir 3D_dataset
unzip ./preprocessed-vla-3d/Scannet.zip -d ./3D_dataset/
python dataset/preprocess_data_scene_script.py --data_path 3D_dataset --dataset_name Scannet
After preprocessing, you will find the preprocessed data in the folder preprocessed_data_scene_script.
python train.py --dataset_dir preprocessed_data_scene_script --dataset_name Scannet --model_path ./Qwen2.5-0.5B-Instruct --exp_path YOUR_EXP_PATH --exp_name YOUR_EXP_NAME --stage_1_epochs EPOCH_NUM --stage_2_epochs EPOCH_NUM --batch_size BATCH_SIZE --gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS --learning_rate LEARNING_RATE --save_per_epoch SAVE_PER_EPOCH
# example
python train.py --dataset_dir preprocessed_data_scene_script --dataset_name Scannet --model_path ./Qwen2.5-0.5B-Instruct --exp_path ./exp --exp_name space_lm_model_qwen_llm_lr_1e-6_point_lr_1e-5 --stage_1_epochs 4 --stage_2_epochs 10 --batch_size 1 --gradient_accumulation_steps 16 --learning_rate 5e-6 --save_per_epoch 2
We train the model in two stages: the first stage trains the point backbone, and the second stage trains the whole model on the Scannet dataset.
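Conceptually, the schedule looks like the sketch below. This is a toy illustration, not the actual train.py code, and the module names are placeholders. Note also that the effective batch size is batch_size × gradient_accumulation_steps, e.g. 1 × 16 = 16 in the example above.

```python
# Toy illustration of the two-stage schedule; module names are placeholders,
# not the real SpaceLM classes.
import torch.nn as nn

class ToySpaceLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.point_backbone = nn.Linear(6, 64)  # stands in for the point-cloud encoder
        self.llm = nn.Linear(64, 64)            # stands in for the Qwen2.5 LLM

model = ToySpaceLM()

# Stage 1: train only the point backbone, keep the LLM frozen.
for p in model.point_backbone.parameters():
    p.requires_grad = True
for p in model.llm.parameters():
    p.requires_grad = False
# ... run --stage_1_epochs of training ...

# Stage 2: unfreeze everything and fine-tune the whole model.
for p in model.parameters():
    p.requires_grad = True
# ... run --stage_2_epochs of training ...
```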
python inference.py --model_path YOUR_EXP_PATH/YOUR_EXP_NAME/EPOCH_NUM --point_cloud PLY_FILE -o OUTPUT_FILE
python visualize.py --point_cloud PLY_FILE --layout OUTPUT_FILE --save OUTPUT_FILE.rrd
rerun OUTPUT_FILE.rrd # viewing works better on Windows
# example
python inference.py --model_path exp/space_lm_model_qwen_llm_lr_1e-5_point_lr_1e-4_no_stage_1_Scannet/stage_2/epoch_0 --point_cloud sample_data/scene0000_00/scene0000_00_pc_result.ply -o test.txt
python visualize.py --point_cloud sample_data/scene0000_00/scene0000_00_pc_result.ply --layout test.txt --save test.rrd
rerun test.rrd # viewing works better on Windows
We welcome contributions in the following ways:
- Submit an Issue to report problems
- Create a Pull Request to improve the code
- Help complete the project documentation
- Share your usage examples
This work is inspired by the following projects:
SpatialLM | Qwen2.5 | SceneScript | VLA-3D