Did you do something similar with this PDF, or did you not run into such issues for some reason? If you used one version as the "master" reference for line numbering, which did you use?
Yonge's and Hicks's translations are mostly aligned the same way. Bailey took more liberties with his translation (including two places where he moved large chunks of text to other sections: 46 and 47 were transferred to 61 and 62 respectively).
For these reasons, my alignment algorithm looked like this:
1) if Y and H are the same but B is not - align B to match the rest
2) if Y and H are different, compare to B and if Y or H match B, go with the majority for alignment
3) if all are different, go to https://logeion.uchicago.edu/ and translate the Greek words at the end of one section and the beginning of the following section to determine the correct split
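For anyone who wants to see the three rules above as logic, here is a rough sketch in Python. It is only an illustration, not what I actually ran: I did the comparison by hand. The function name `choose_boundary` and the idea of representing each translator's section break as a short string (for example, the first words of the section) are assumptions made for the example.

```python
from collections import Counter

def choose_boundary(yonge, hicks, bailey):
    """Pick the section break that at least two translators agree on.

    Each argument is a hypothetical marker for where that translator
    starts the section (for example, its first few words).
    Returns the agreed marker, or None when all three differ and the
    Greek text has to be checked by hand (rule 3).
    """
    votes = Counter([yonge, hicks, bailey])
    boundary, count = votes.most_common(1)[0]
    if count >= 2:
        # Rules 1 and 2: follow the majority and realign the odd one out.
        return boundary
    # Rule 3: no agreement, so a manual lookup in the Greek is needed.
    return None

# Rule 1: Y and H agree, so B is aligned to them.
print(choose_boundary("For the atoms", "For the atoms", "Moreover"))
# Rule 3: all three differ, so the result is None (manual check).
print(choose_boundary("a", "b", "c"))
```

Rules 1 and 2 collapse into a single majority vote here, since in both cases two of the three translations agree.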
Tau Phi, I know what you've already done has been a huge amount of work, but let me ask this: Does the method of assembly you used make it possible with reasonable effort to:
1 - Do the same thing for the rest of Diogenes Laertius so that we have the full Book X in one place? That makes it much easier for word searching.
2 - In cutting and pasting from the PDF I am seeing a problem that I've had with other PDFs of my own in the past -- there's something wrong with the constructions involving "f" that corrupts the words.
1- It's very much doable. I've done 112 out of 154 sections already, so there are only 42 sections left in Book X (27%). As far as I know, all three translators did the entirety of Book X, so I don't see any reason why this couldn't be done. I decided to do only Epicurus' sections because I wanted the core texts in one place without any fluff.
2- I used Ghostscript to add a ToC to the document, and it looks like there's some issue with the character encoding during the recompilation process. I'll try to find another way to do it. In the meantime, please use my initial file without the ToC. It should work without any issues.
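As a possible stopgap for text already pasted from the ToC version, here is a small Python sketch. It assumes the corruption comes from the "f" ligature glyphs (such as "fi" and "fl") arriving as single Unicode ligature characters, which I have not confirmed for this file; if the pasted text contains other garbage characters instead, this will not help. The helper name `expand_ligatures` is made up for the example.

```python
# Unicode "Alphabetic Presentation Forms" Latin ligatures and their
# plain-letter equivalents.
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb05": "st",
    "\ufb06": "st",
}

# Build the translation table once; only the ligature characters are
# touched, so Greek text and other accents are left alone.
_TABLE = str.maketrans(LIGATURES)

def expand_ligatures(text):
    """Replace single-character ligatures with ordinary letters."""
    return text.translate(_TABLE)

print(expand_ligatures("di\ufb03cult to de\ufb01ne"))
```

A targeted table is used rather than a blanket Unicode normalization so that the polytonic Greek in the document is not altered.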