
[APPack] Added Flat Placement Reconstruction #2870


Merged


Contributor

@AlexandreSinger AlexandreSinger commented Jan 23, 2025

Added logic to the packer to reconstruct clusters from the flat placement, if one is provided. The packer falls back on the original packing algorithm if it runs out of candidates to cluster.
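As a rough illustration of the idea (not the code in this PR), atoms that the flat placement assigns to the same tile can be grouped into candidate clusters. The `FlatAtomLoc` record, its fields, and `group_atoms_by_tile` are hypothetical names used only for this sketch; the real packer must also check cluster legality and fall back on the original algorithm.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for one atom in a flat placement (not VPR's actual type).
struct FlatAtomLoc {
    std::string atom_name;
    int x, y, layer, sub_tile;
};

// Key identifying one tile location: (x, y, layer, sub_tile).
using TileKey = std::tuple<int, int, int, int>;

// Group atoms that share a tile location. Each group is only a *candidate*
// cluster; legality checking is outside the scope of this sketch.
std::map<TileKey, std::vector<std::string>> group_atoms_by_tile(
    const std::vector<FlatAtomLoc>& atoms) {
    std::map<TileKey, std::vector<std::string>> groups;
    for (const FlatAtomLoc& a : atoms) {
        groups[std::make_tuple(a.x, a.y, a.layer, a.sub_tile)].push_back(a.atom_name);
    }
    return groups;
}
```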

Added logic to the initial placer to reconstruct the placement of each cluster from the flat placement, if one is provided. It tries to place each cluster at the centroid of the flat-placement locations of the atoms in that cluster; if the cluster cannot be placed there, it falls back on the original initial placement algorithm.
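The centroid-then-fallback behaviour can be sketched as follows. This is a minimal illustration and not the code in this PR: `Point`, `compute_centroid`, `place_cluster`, and the two callbacks are hypothetical names, and the centroid is assumed to be the arithmetic mean of the atom positions.

```cpp
#include <functional>
#include <optional>
#include <vector>

// Hypothetical 2D position of an atom in the flat placement.
struct Point {
    double x;
    double y;
};

// Arithmetic mean of the atom positions; empty optional for an empty cluster.
std::optional<Point> compute_centroid(const std::vector<Point>& atom_locs) {
    if (atom_locs.empty()) return std::nullopt;
    double sum_x = 0.0, sum_y = 0.0;
    for (const Point& p : atom_locs) {
        sum_x += p.x;
        sum_y += p.y;
    }
    const double n = static_cast<double>(atom_locs.size());
    return Point{sum_x / n, sum_y / n};
}

// Try to place the cluster at the centroid; if that fails, defer to the
// fallback. Both callbacks are assumptions standing in for the real placer.
bool place_cluster(const std::vector<Point>& atom_locs,
                   const std::function<bool(const Point&)>& try_place_at,
                   const std::function<bool()>& fallback_place) {
    std::optional<Point> centroid = compute_centroid(atom_locs);
    if (centroid && try_place_at(*centroid)) return true;
    return fallback_place();
}
```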

Added a simple testcase to ensure the reading and writing of the flat placement file is working.

This resolves one of the problems discussed in issue #2868

This has been tested on all of the MCNC and VTR_chain benchmarks. The results below were collected by running default VTR (iteration 0), which generated a flat placement file. This flat placement file was passed back into VTR to reconstruct the clustering and initial placement (the SA annealer was then run), and another flat placement file was generated. This process was then repeated iteratively. The runs used fixed-size architectures (large enough that the limiting device utilization was not too high) with fixed channel widths equal to 1.3 times the minimum channel width.

MCNC (averaged over all benchmarks):

| iter_num | pack_time(s) | device_util | total_wl | CPD(ns) | percent_cluster_errors | percent_atoms_displaced | average_atom_displacement |
|---|---|---|---|---|---|---|---|
| 0 | 1.7125 | 0.551 | 15915.75 | 5.7354425 | -1 | -1 | -1 |
| 1 | 1.444 | 0.5525 | 15845.75 | 5.717265 | 0.0556921 | 0.11208215 | 0.8230784 |
| 2 | 1.4045 | 0.551 | 15760.75 | 5.781261 | 0.0169662 | 0.0540315 | 0.497987 |
| 3 | 1.427 | 0.5505 | 15741.55 | 5.7544645 | 0.00690835 | 0.0347828 | 0.4006579 |
| 4 | 1.426 | 0.5505 | 15731.45 | 5.77473 | 0.00038335 | 0.03156595 | 0.3836701 |
| 5 | 1.3945 | 0.5505 | 15744.25 | 5.7533945 | 0 | 0.0308358 | 0.3788053 |
| 6 | 1.509 | 0.5505 | 15738 | 5.704249 | 0 | 0.03185395 | 0.376404 |
| 7 | 1.413 | 0.5505 | 15718.6 | 5.748659 | 0 | 0.03113375 | 0.37807795 |
| 8 | 1.397 | 0.5505 | 15771.6 | 5.721962 | 0 | 0.0296978 | 0.35781635 |
| 9 | 1.428 | 0.5505 | 15771.5 | 5.7173095 | 0 | 0.03104095 | 0.37434115 |
| 10 | 1.414 | 0.5505 | 15737.35 | 5.729995 | 0 | 0.03036075 | 0.3719598 |

VTR_chain (averaged over all benchmarks):

| iter_num | pack_time(s) | device_util | total_wl | CPD(ns) | percent_cluster_errors | percent_atoms_displaced | average_atom_displacement |
|---|---|---|---|---|---|---|---|
| 0 | 37.6647619047619 | 0.411428571428572 | 181585.095238095 | 17.5657523809524 | -1 | -1 | -1 |
| 1 | 31.14 | 0.410476190476191 | 180035.095238095 | 17.5552023809524 | 0.033686380952381 | 0.0495172380952381 | 0.960496380952381 |
| 2 | 31.0114285714286 | 0.410952380952381 | 179332.80952381 | 17.5821566666667 | 0.0138867619047619 | 0.0231637142857143 | 0.446147714285714 |
| 3 | 30.9804761904762 | 0.41 | 179152.095238095 | 17.5575819047619 | 0.00473233333333333 | 0.00763919047619048 | 0.207932 |
| 4 | 30.8338095238095 | 0.41 | 179056.428571429 | 17.5523061904762 | 0.00275104761904762 | 0.00427171428571429 | 0.136488428571429 |
| 5 | 30.8985714285714 | 0.410476190476191 | 178975.285714286 | 17.5100333333333 | 0.00185947619047619 | 0.00360457142857143 | 0.100016333333333 |
| 6 | 30.8866666666667 | 0.410476190476191 | 178769.761904762 | 17.5679404761905 | 0.00125547619047619 | 0.00206866666666667 | 0.0626832380952381 |
| 7 | 30.922380952381 | 0.410476190476191 | 178743.047619048 | 17.5653671428571 | 0.000850333333333333 | 0.00172095238095238 | 0.0530320476190476 |
| 8 | 30.9614285714286 | 0.410476190476191 | 178256.904761905 | 17.5427557142857 | 0.000358380952380952 | 0.00104657142857143 | 0.0277478095238095 |
| 9 | 30.7666666666667 | 0.410476190476191 | 178121.80952381 | 17.5834347619048 | 0.000158190476190476 | 0.00040747619047619 | 0.0110560476190476 |
| 10 | 30.7390476190476 | 0.410476190476191 | 177837.571428571 | 17.5318652380952 | 3.16666666666667E-05 | 2.80952380952381E-06 | 1.87619047619048E-05 |

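The WL (wirelength) and CPD (critical path delay) are measured after running the SA. A value of -1 in iteration 0 means the metric does not apply, since no flat placement was read in that run.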

The percent cluster errors is the number of clusters which contain an atom that was not placed in the same place as the other atoms in the cluster (according to the flat placement) divided by the total number of clusters in the netlist.

The percent atoms displaced is the number of atoms that were not placed in the correct spot (according to the flat placement) divided by the number of atoms in the netlist.

The average atom displacement is the average distance between each atom's reconstructed location and its location in the flat placement.
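The three definitions above can be sketched in code as follows. This is an illustration under stated assumptions, not the code used to collect the results: `AtomRecord`, `ReconstructionMetrics`, and `compute_metrics` are hypothetical names, displacement is assumed to be Manhattan distance, and the values are returned as fractions (whether the reported numbers are scaled by 100 is not stated here).

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical per-atom record (not a VPR type).
struct AtomRecord {
    int cluster_id;          // index in [0, num_clusters)
    int target_x, target_y;  // location in the flat placement
    int placed_x, placed_y;  // location after reconstruction
};

struct ReconstructionMetrics {
    double frac_cluster_errors;    // clusters with mismatched targets / clusters
    double frac_atoms_displaced;   // atoms not at their target / atoms
    double avg_atom_displacement;  // mean Manhattan distance (assumed metric)
};

ReconstructionMetrics compute_metrics(const std::vector<AtomRecord>& atoms,
                                      std::size_t num_clusters) {
    ReconstructionMetrics m{0.0, 0.0, 0.0};
    if (atoms.empty() || num_clusters == 0) return m;

    std::vector<bool> seen(num_clusters, false), bad(num_clusters, false);
    std::vector<int> first_x(num_clusters, 0), first_y(num_clusters, 0);
    std::size_t displaced = 0;
    double total_disp = 0.0;

    for (const AtomRecord& a : atoms) {
        const std::size_t c = static_cast<std::size_t>(a.cluster_id);
        if (!seen[c]) {
            // Remember the first target seen for this cluster.
            seen[c] = true;
            first_x[c] = a.target_x;
            first_y[c] = a.target_y;
        } else if (a.target_x != first_x[c] || a.target_y != first_y[c]) {
            // An atom in this cluster wanted to be somewhere else.
            bad[c] = true;
        }
        const int d = std::abs(a.placed_x - a.target_x) + std::abs(a.placed_y - a.target_y);
        if (d != 0) ++displaced;
        total_disp += d;
    }

    std::size_t num_bad = 0;
    for (bool b : bad) {
        if (b) ++num_bad;
    }
    const double num_atoms = static_cast<double>(atoms.size());
    m.frac_cluster_errors = static_cast<double>(num_bad) / static_cast<double>(num_clusters);
    m.frac_atoms_displaced = static_cast<double>(displaced) / num_atoms;
    m.avg_atom_displacement = total_disp / num_atoms;
    return m;
}
```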

Based on these results, reconstructing the clusters from a flat placement reduces the packing runtime by about 15%. The error between the reconstructed placement and the flat placement is less than 5% (higher on some circuits than others); however, the WL and CPD achieved remain stable. This implies that although the reconstruction is not perfect, it still achieves similar QoR, because this work falls back on the original packing and placement algorithms. The error also decreases over several iterations, eventually reaching 0% on most circuits. This shows that we are able to reconstruct perfectly, aside from some small ordering issues early on (changing the order in which atoms are inserted into a cluster changes the legality of future insertions).
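As a quick check of the packing-runtime figure, comparing the average pack time of iteration 0 with iteration 1 in the two tables above (VTR_chain values rounded to two decimals) gives:

$$
\frac{1.7125 - 1.444}{1.7125} \approx 15.7\% \ \text{(MCNC)}, \qquad
\frac{37.66 - 31.14}{37.66} \approx 17.3\% \ \text{(VTR\_chain)}
$$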

MCNC appears to have higher reconstruction error than VTR_chain. I attribute this to the architecture: MCNC uses a homogeneous architecture (all LUTs), so the reconstruction is likely to swap atoms between clusters, which changes the placement. However, even with this higher average error, the average displacement remains low and the CPD and WL remain stable.

@github-actions github-actions bot added the VPR (VPR FPGA Placement & Routing Tool) and lang-cpp (C/C++ code) labels Jan 23, 2025
@AlexandreSinger
Contributor Author

@vaughnbetz This has passed CI. I agree with the improvements discussed in the VTR meeting; however, I would like to merge this version in so I can add it to the AP flow. I can return to this project later to further improve it as needed.

Although it looks like a lot of lines were modified, 900+ lines were added for a strong testcase for this work. Please review when you have a moment!

Contributor

@vaughnbetz vaughnbetz left a comment


Looks good; a few suggestions.

Updated comments and some variable names as per VB review.
@AlexandreSinger AlexandreSinger merged commit da66e10 into verilog-to-routing:master Jan 28, 2025
36 checks passed
@AlexandreSinger AlexandreSinger deleted the feature-appack branch January 28, 2025 19:15