<p dir="ltr">Effective crime linkage analysis is crucial for identifying serial offenders and enhancing public safety. To address the limitations of traditional crime linkage methods when handling high-dimensional, sparse, and heterogeneous data, this paper proposes a Siamese Autoencoder framework that learns meaningful latent representations and uncovers correlations in highly complex data. Evaluated on a dataset from the Violent Crime Linkage Analysis System, a database maintained by the Serious Crime Analysis Section of the UK’s National Crime Agency, our approach mitigates signal dilution in high-dimensional sparse data by integrating geographic-temporal features at the decoder stage. This integration amplifies learned behavioral representations rather than allowing them to be overwhelmed at the input stage, leading to consistent improvements over baseline methods across multiple metrics. We further examine how different data reduction strategies based on domain expertise affect model performance, offering practical insights into preprocessing for crime linkage. Our results show that advanced machine learning approaches can enhance linkage accuracy, improving AUC by up to 9% over traditional methods, and can provide insights to support human decision-making in crime investigation.</p>
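<p dir="ltr">To make the idea of decoder-stage integration concrete, the following is a minimal NumPy sketch of a forward pass, not the implementation used in this paper. The class name <code>SiameseAutoencoderSketch</code>, the single-layer encoder and decoder, the activations, the layer sizes, and the reconstruction-plus-contrastive loss are all illustrative assumptions. The only element taken from the text is that geographic-temporal features bypass the encoder and are concatenated with the latent code before decoding.</p>

```python
import numpy as np

rng = np.random.default_rng(0)


def init_linear(n_in, n_out):
    """Small random weights and zero bias for one dense layer (assumed init)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


class SiameseAutoencoderSketch:
    """Hypothetical one-layer Siamese autoencoder with shared weights."""

    def __init__(self, n_behavior, n_context, n_latent):
        # The encoder sees only the sparse behavioral features.
        self.W_enc, self.b_enc = init_linear(n_behavior, n_latent)
        # The decoder sees the latent code plus geographic-temporal context.
        self.W_dec, self.b_dec = init_linear(n_latent + n_context, n_behavior)

    def encode(self, behavior):
        return np.tanh(behavior @ self.W_enc + self.b_enc)

    def decode(self, latent, context):
        # Decoder-stage integration: context is injected here, not at the input.
        z = np.concatenate([latent, context], axis=-1)
        logits = z @ self.W_dec + self.b_dec
        return 1.0 / (1.0 + np.exp(-logits))  # sigmoid for binary features

    def forward(self, beh_a, ctx_a, beh_b, ctx_b):
        # Both crimes of a pair pass through the same (shared) weights.
        z_a, z_b = self.encode(beh_a), self.encode(beh_b)
        recon_a = self.decode(z_a, ctx_a)
        recon_b = self.decode(z_b, ctx_b)
        distance = np.linalg.norm(z_a - z_b, axis=-1)
        return recon_a, recon_b, distance


def combined_loss(model, beh_a, ctx_a, beh_b, ctx_b, linked, margin=1.0):
    """Assumed objective: MSE reconstruction plus a contrastive term."""
    recon_a, recon_b, dist = model.forward(beh_a, ctx_a, beh_b, ctx_b)
    recon = np.mean((recon_a - beh_a) ** 2) + np.mean((recon_b - beh_b) ** 2)
    # Linked pairs are pulled together; unlinked pairs pushed past the margin.
    contrastive = np.mean(
        linked * dist ** 2 + (1.0 - linked) * np.maximum(0.0, margin - dist) ** 2
    )
    return recon + contrastive


if __name__ == "__main__":
    # Synthetic stand-in data: 5 crime pairs, 20 sparse binary behavioral
    # features, 3 geographic-temporal features (all sizes are made up).
    data_rng = np.random.default_rng(1)
    beh_a = (data_rng.random((5, 20)) < 0.1).astype(float)
    beh_b = (data_rng.random((5, 20)) < 0.1).astype(float)
    ctx_a, ctx_b = data_rng.random((5, 3)), data_rng.random((5, 3))
    linked = np.array([1.0, 0.0, 1.0, 0.0, 0.0])

    model = SiameseAutoencoderSketch(n_behavior=20, n_context=3, n_latent=4)
    print("loss:", combined_loss(model, beh_a, ctx_a, beh_b, ctx_b, linked))
```

<p dir="ltr">In this sketch the few geographic-temporal columns never compete with the many sparse behavioral columns inside the encoder, which is one plausible reading of how decoder-stage integration avoids signal dilution. After training, the latent distance between two crimes could serve as a linkage score.</p>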