About
CAD-Recode is a model that takes a point cloud as input and outputs a code sequence that generates the corresponding CAD model. The LLM used is Qwen2. At first glance, representing CAD data this way may seem redundant, but its performance is reported to be exceptionally good. It is unclear whether this performance comes from the capabilities of the language model or from the data synthesis described in the paper, but the results are intriguing.
Their pre-trained model has been made available on Hugging Face, allowing anyone to try it out. How well can their model handle complex three-dimensional shapes? This remains a fascinating question to explore.
Data
I decided to test how well the model performs with very simple shapes. In one of my previous articles, I wrote about CAD design using FreeCAD.
FreeCAD CAM WorkBench
For this experiment, I plan to convert a STEP file created there into an STL file, and then use CAD-Recode to convert this STL data back into a STEP file. The original shape looks like this:
I counted the number of vertices in the STL file that was converted from the STEP file using FreeCAD.
import trimesh
stl_file = 'CAM-work1-Body001.stl'
mesh = trimesh.load_mesh(stl_file)
unique_vertices = mesh.vertices
print(unique_vertices.shape)
(528, 3)
The vertex array has shape (528, 3), which I felt was enough to capture the features of the shape.
Result
Generated Strings
r=w0.workplane(offset=5/2).cylinder(5,46).union(w0.sketch().rect(100,100).reset().face(w0.sketch().segment((-40,-40),(-10,-40)).arc((0,-42),(10,-40)).segment((40,-40)).segment((40,40)).segment((10,40)).arc((0,42),(-10,40)).segment((-40,40)).close().assemble(),mode='s').push([(-35,-36)]).circle(2,mode='s').finalize().extrude(14))
The length of the generated string is 330.
Generated STEP file from Strings
While the overall shape is similar, there are issues with finer details: the bottom of the box has holes, there are distortions, and chamfers are missing. It seems that the model struggles to fully replicate the original shape. I still don’t understand how to achieve a more accurate reproduction of the original shape, but I plan to conduct a parameter study to explore this further in the next step.
Metrics
The Chamfer Distance (CD) and IoU between the target and the input are CD: 0.313 and IoU: 0.839.
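For readers unfamiliar with these two metrics, here is a minimal sketch of how they can be computed. It uses one common definition of the Chamfer Distance (mean squared nearest-neighbour distance, summed over both directions) and an IoU on boolean occupancy grids. The function names `chamfer_distance` and `voxel_iou` are my own, and the evaluation used for the numbers above may differ in scaling, normalization, or in how the IoU is computed.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared distances, averaged per direction, then summed.
    Other conventions (unsquared, scaled) exist.
    """
    # Pairwise squared distances, shape (N, M)
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.einsum("nmk,nmk->nm", diff, diff)
    # Mean nearest-neighbour distance from a to b, plus from b to a
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def voxel_iou(occ_a: np.ndarray, occ_b: np.ndarray) -> float:
    """IoU of two boolean occupancy grids with identical shape."""
    inter = np.logical_and(occ_a, occ_b).sum()
    union = np.logical_or(occ_a, occ_b).sum()
    # Two empty grids are treated as identical
    return float(inter / union) if union else 1.0
```

A lower CD and a higher IoU both indicate a closer match, so the two numbers should be read together.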
Consideration
Why is the model failing to accurately capture the shape?
Several causes are possible, but the fundamental problem seems to be insufficient representational power, which comes down to the following factors:
- The number of points in the point cloud
- The maximum length of input tokens
The number of points in the point cloud looks particularly relevant. In the code, the mesh is converted into a point cloud by the following function.
import numpy as np
import trimesh

def mesh_to_point_cloud(mesh, n_points=256):
    # Sample n_points on the surface; faces holds the face index of each sample
    vertices, faces = trimesh.sample.sample_surface(mesh, n_points)
    # Append the face normal to each sampled point -> shape (n_points, 6)
    point_cloud = np.concatenate((
        np.asarray(vertices),
        mesh.face_normals[faces]
    ), axis=1)
    # Sort the points by z, then y, then x
    ids = np.lexsort((point_cloud[:, 0], point_cloud[:, 1], point_cloud[:, 2]))
    point_cloud = point_cloud[ids]
    return point_cloud
This is part of the 3D point cloud encoding method described in a previous article.
By default, the number of points in the point cloud is limited to only 256.
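The ordering step in `mesh_to_point_cloud` is easy to misread, because `np.lexsort` treats the last key as the primary one. The small sketch below isolates that step on four hand-written points. The helper name `sort_points_zyx` and the sample coordinates are mine, chosen only for illustration.

```python
import numpy as np


def sort_points_zyx(points: np.ndarray) -> np.ndarray:
    """Order points by z first, then y, then x, as in mesh_to_point_cloud."""
    # np.lexsort sorts by the LAST key first, so z is the primary key here
    ids = np.lexsort((points[:, 0], points[:, 1], points[:, 2]))
    return points[ids]


pts = np.array([
    [1.0, 0.0, 2.0],
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0],
])
sorted_pts = sort_points_zyx(pts)
print(sorted_pts)
```

The two points with z = 0 come first (ordered by y), followed by the two points with z = 2 (ordered by x, since their y values are equal). The sampled points therefore reach the model in a fixed spatial order, regardless of the order in which they were sampled.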
Additional Experiments
When n_points = 400
w0=cq.Workplane('XY',origin=(0,0,-7))
r=w0.workplane(offset=5/2).moveTo(-1,0).cylinder(5,46).union(w0.sketch().segment((-50,-50),(50,-50)).segment((50,50)).segment((-50,50)).segment((-50,41)).arc((-44,44),(-39,40)).segment((39,40)).segment((39,-40)).segment((-38,-40)).arc((-43,-34),(-48,-29)).segment((-50,-29)).close().assemble().push([(26,-31)]).circle(14,mode='s').finalize().extrude(15))
The length of the string is 414, with CD: 0.890 and IoU: 0.730.
When n_points = 512
w0=cq.Workplane('XY',origin=(0,0,-7))
r=w0.sketch().segment((-50,-50),(46,-50)).arc((48,-47),(50,-44)).segment((50,44)).arc((48,47),(46,50)).segment((-50,50)).segment((-50,44)).arc((-49,42),(-50,40)).close().assemble().reset().face(w0.sketch().segment((-40,-40),(30,-40)).arc((35,-39),(40,-35)).segment((40,40)).segment((-40,40)).segment((-40,35)).arc((-43,27),(-40,19)).segment((-40,-27)).arc((-35,-39),(-40,-28)).close().assemble(),mode='s').finalize().extrude(14)
The length of the string is 488.
When n_points = 1024
w0=cq.Workplane('XY',origin=(0,0,-7))
r=w0.sketch().arc((-50,-43),(-47,-47),(-44,-50)).segment((44,-50)).arc((47,-47),(50,-44)).segment((50,44)).arc((47,47),(44,50)).segment((-44,50)).arc((-47,47),(-50,44)).close().assemble().reset().face(w0.sketch().segment((-40,-36),(-40,36)).arc((-41,0),(-40,-36)).assemble(),mode='s').rect(80,80,mode='s').reset().face(w0.sketch().arc((40,-36),(41,-36),(42,-35)).arc((40,-34),(40,-32)).close().assemble(),mode='s').reset().face(w0.sketch().arc((42,35),(43,34),(44,32)).arc((45,34),(47,35)).close().assemble(),mode='s').finalize().extrude(14)
Although the code itself is generated as shown above when n_points is set to 512 or 1024, executing it raised the following error:
exec(py_string, globals())
File "<string>", line 3, in <module>
File "/opt/conda/lib/python3.11/site-packages/multimethod/__init__.py", line 375, in __call__
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/cadquery/sketch.py", line 874, in arc
val = Edge.makeThreePointArc(Vector(p1), Vector(p2), Vector(p3))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/cadquery/occ_impl/shapes.py", line 2328, in makeThreePointArc
).Value()
^^^^^^^
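It seems that the generated code describes a shape that is not valid: the traceback ends in makeThreePointArc, which suggests that at least one of the generated arcs is geometrically degenerate.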
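One way to look for the cause is to check the arcs in the generated strings before executing them. A three-point arc has no solution when its points lie on a straight line. The sketch below tests for this with a 2D cross product. The helper `is_degenerate_arc` is hypothetical (it is not part of CadQuery), and I am assuming that the two-argument form `arc(p2, p3)` starts from the end point of the preceding segment.

```python
def is_degenerate_arc(p1, p2, p3, tol=1e-9):
    """Return True if three 2D points are collinear (or coincide).

    No circle passes through such points, so a three-point arc
    cannot be constructed from them.
    """
    # z-component of the cross product (p2 - p1) x (p3 - p1)
    cross = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p2[1] - p1[1]) * (p3[0] - p1[0])
    return abs(cross) <= tol


# n_points = 512: .segment((-50,-50),(46,-50)).arc((48,-47),(50,-44))
print(is_degenerate_arc((46, -50), (48, -47), (50, -44)))
# n_points = 1024: .segment((44,-50)).arc((47,-47),(50,-44))
print(is_degenerate_arc((44, -50), (47, -47), (50, -44)))
# n_points = 512: .segment((-50,44)).arc((-49,42),(-50,40))
print(is_degenerate_arc((-50, 44), (-49, 42), (-50, 40)))
```

Under that assumption, the first two arcs are flagged as collinear and the third is not. At least some of the "rounded corners" the model produced at these settings are therefore straight lines written as arcs, which would be consistent with the failure inside makeThreePointArc.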