Find duplicated code with copy/paste detection (CPD)
This was some quick research on something that I found interesting. I was trying to solve a problem, and this might be overkill. It is interesting though!
CPD
Duplicate code can be hard to find, especially in a large project. But PMD’s Copy/Paste Detector (CPD) can find it for you!
CPD works with Java, JSP, C/C++, C#, Go, Kotlin, Ruby, Swift and many more languages. It can be used via command-line, or via an Ant task. It can also be run with Maven by using the cpd-check goal on the Maven PMD Plugin.
Your own language is missing? See how to add it here.
It says it supports Python, ECMAScript (JavaScript), C/C++, and more. Those are the ones I would want, so ...
It seems to be a command-line tool, so it should be easy to use.
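Here is a rough sketch of what a command-line run might look like when driven from a Python script. It assumes PMD 7's `pmd` launcher is on the PATH and that the `cpd` subcommand takes `--minimum-tokens`, `--dir`, and `--language` as the PMD docs describe; I have not run this, so check the flags against the installed version. The helper name `build_cpd_command` and the `src` directory are made up for illustration.

```python
import shutil
import subprocess


def build_cpd_command(source_dir, language="python", minimum_tokens=100):
    """Build the argument list for a CPD run (hypothetical helper)."""
    return [
        "pmd", "cpd",
        "--minimum-tokens", str(minimum_tokens),  # smallest duplicate worth reporting
        "--dir", source_dir,                      # assumed flag name, per PMD 7 docs
        "--language", language,
    ]


if __name__ == "__main__":
    command = build_cpd_command("src")  # "src" is a placeholder path
    if shutil.which(command[0]) is None:
        print("pmd not found on PATH; would have run:", " ".join(command))
    else:
        # A non-zero exit status may just mean that duplicates were found.
        result = subprocess.run(command, capture_output=True, text=True)
        print(result.stdout)
```

Building the argument list separately from running it makes it easy to print the command first and try it by hand before wiring it into anything.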
Other solutions
- A Usable Copy-Paste Detector in A Few Lines of Python
  - a description of how to build one
  - they used some library (can't remember which now) and did this with Python 2.x
  - it was a hybrid system that used tokens rather than ASTs like many others
  - not sure if it would be doable with Python 3.x; it would take some time to find out
- jscpd
  - it says it supports JavaScript, Python, C++, and others. It looks like a Node-based tool, so ...
  - it also has a CLI
  - it looks like there is a VSCode plugin that can be added
- jsinspect
  - Ranga pointed me to this one, which started all of this. This tool seems to be JavaScript-only, though
  - apparently there is a browser plug-in for this
- lizard
  - the homepage lets you enter code in the browser ("Try Lizard in Your Browser")
  - released in 2020/10
  - supports Python, JS, C/C++, and others
  - does more than copy/paste duplication detection
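To get a feel for the token-based idea mentioned above, here is a minimal sketch in Python 3 using only the standard `tokenize` module. This is not the method from the article (I don't remember which library they used) and not how CPD or jscpd work internally. It is just one simple way to do it: slide a fixed-size window over the token stream and report windows that show up more than once. The function names and the window size of 8 are my own choices.

```python
import io
import tokenize
from collections import defaultdict

# Token types that carry no meaning for duplicate detection.
SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENCODING, tokenize.ENDMARKER,
}


def significant_tokens(source):
    """Return (text, line_number) pairs for the tokens worth comparing."""
    reader = io.StringIO(source).readline
    return [
        (tok.string, tok.start[0])
        for tok in tokenize.generate_tokens(reader)
        if tok.type not in SKIPPED
    ]


def find_duplicates(source, window=8):
    """Map each repeated run of `window` tokens to the lines where it starts."""
    tokens = significant_tokens(source)
    seen = defaultdict(list)
    for i in range(len(tokens) - window + 1):
        key = tuple(text for text, _ in tokens[i:i + window])
        seen[key].append(tokens[i][1])
    return {key: lines for key, lines in seen.items() if len(lines) > 1}


if __name__ == "__main__":
    sample = (
        "def area(w, h):\n"
        "    total = w * h + 1\n"
        "    return total\n"
        "\n"
        "def size(w, h):\n"
        "    total = w * h + 1\n"
        "    return total\n"
    )
    for key, lines in find_duplicates(sample).items():
        print(lines, " ".join(key))
```

Comparing exact token text means renamed variables hide a duplicate. Swapping identifiers for a placeholder before building the windows would catch those too, at the cost of more false positives.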
Might be something to come back to at a later date.