2024-09-21
A Quine is a program whose output, when executed, is its own source code. This type of program is also known as a “self-replicating program”, and its self-referential nature is fascinatingly similar to that of biological organisms.
The following two examples of Quines were copied from Wikipedia.
Example 1: Python
c = 'c = %r; print(c %% c)'; print(c % c)
Example 2: Java
public class Quine
{
  public static void main(String[] args)
  {
    char q = 34; // Quotation mark character
    String[] l = { // Array of source code
    "public class Quine",
    "{",
    "  public static void main(String[] args)",
    "  {",
    "    char q = 34; // Quotation mark character",
    "    String[] l = { // Array of source code",
    "    ",
    "    };",
    "    for (int i = 0; i < 6; i++) // Print opening code",
    "        System.out.println(l[i]);",
    "    for (int i = 0; i < l.length; i++) // Print string array",
    "        System.out.println(l[6] + q + l[i] + q + ',');",
    "    for (int i = 7; i < l.length; i++) // Print this code",
    "        System.out.println(l[i]);",
    "  }",
    "}",
    };
    for (int i = 0; i < 6; i++) // Print opening code
        System.out.println(l[i]);
    for (int i = 0; i < l.length; i++) // Print string array
        System.out.println(l[6] + q + l[i] + q + ',');
    for (int i = 7; i < l.length; i++) // Print this code
        System.out.println(l[i]);
  }
}
However, neither example is easy to understand, and writing a Quine in a third language, such as C++ or JavaScript, would mean rethinking the logic from scratch.
So I wanted to find a universal method for writing a Quine in the most intuitive way possible. A Quine written this way might not be the shortest or the most efficient, but the method carries over to other languages with little effort.
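Although the two Quines in the previous section look very different, the process can be summarized in the same way: the program stores a description of its own text in a string (call it the “DNA”), which covers the program text before the string (the head) and the text after it (the tail). The program then prints the head, the DNA string itself, and the tail, in that order.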
Although it sounds simple, in common programming languages, representing strings requires extra escaping for newlines and quotation marks. For example, a newline in a string must be written as \n, and quotation marks need a backslash in front of them. These add unnecessary trouble, which is why the code above is hard to read.
However, these escape symbols are essentially a form of encoding. So, my idea is to simply encode the DNA string into the simplest hexadecimal format directly, and then decode it when it needs to be printed.
This way, no weird hacks are needed.
Furthermore, it doesn’t strictly have to be hexadecimal; Base64, Base32, or even using nucleotide letters like real DNA would work.
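To make the choice of encoding concrete, here is a small sketch that prints the same text in hexadecimal, Base64, Base32, and a made-up nucleotide alphabet. The `to_nucleotides` and `from_nucleotides` helpers and the A/C/G/T bit mapping are assumptions made purely for illustration; any reversible encoding whose output contains no quotes, backslashes, or newlines should serve equally well.

```python
import base64

BASES = 'ACGT'  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def to_nucleotides(text: str) -> str:
    """Encode each UTF-8 byte as four letters, two bits per letter."""
    out = []
    for byte in text.encode('utf-8'):
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return ''.join(out)


def from_nucleotides(letters: str) -> str:
    """Invert to_nucleotides: four letters become one byte again."""
    data = bytearray()
    for i in range(0, len(letters), 4):
        byte = 0
        for ch in letters[i:i + 4]:
            byte = (byte << 2) | BASES.index(ch)
        data.append(byte)
    return data.decode('utf-8')


# A string that would need escaping if written as a literal.
text = "print('hi')\n"
raw = text.encode('utf-8')

print(raw.hex())
print(base64.b64encode(raw).decode('ascii'))
print(base64.b32encode(raw).decode('ascii'))
print(to_nucleotides(text))

# Every encoding above is reversible; check the hypothetical one explicitly.
assert from_nucleotides(to_nucleotides(text)) == text
```

Hexadecimal remains the simplest of these to decode by hand in languages with a small standard library, which is presumably why it is a convenient default.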
To encode the string into hexadecimal, I wrote a small Python tool that reads a string and outputs its hex form:
import sys

# Read all of stdin and drop one trailing newline, if present.
s = sys.stdin.read()
if s.endswith('\n'):
    s = s[:-1]
print(s.encode('utf-8').hex())
Save the code above as hexencode.py; you can then use it from a shell like this:
cat input.txt | python3 hexencode.py
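The same round trip can also be done directly in Python, without the pipe. The sketch below uses two illustrative helper names, `hex_encode` and `hex_decode`, which are not part of the tool above; it shows that text containing a quotation mark turns into plain hex digits and back.

```python
def hex_encode(text: str) -> str:
    """Return the UTF-8 bytes of text as a lowercase hex string."""
    return text.encode('utf-8').hex()


def hex_decode(hex_text: str) -> str:
    """Invert hex_encode: turn a hex string back into the original text."""
    return bytes.fromhex(hex_text).decode('utf-8')


# Quotes and newlines need no escaping once encoded:
# the result only contains the characters 0-9 and a-f.
sample = "dna = '"
encoded = hex_encode(sample)
print(encoded)  # 646e61203d2027
assert hex_decode(encoded) == sample
```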
Python is relatively simple to write, so let’s start with Python.
First, define the DNA string. We don’t know the content of the “DNA” yet, so let’s use emoji as placeholders. The string has two parts, the head and the tail; let the head be a cat and the tail a snake:
dna = '🐱,🐍'
Then extract the head and tail:
head, tail = dna.split(',')
Since we plan to use hex encoding, we need to decode the head and tail from hex back to their original form here:
head = bytes.fromhex(head).decode('utf-8')
tail = bytes.fromhex(tail).decode('utf-8')
Finally, we join the head, DNA, and tail together and output them:
print(head + dna + tail)
The program now looks like this:
dna = '🐱,🐍'
head, tail = dna.split(',')
head = bytes.fromhex(head).decode('utf-8')
tail = bytes.fromhex(tail).decode('utf-8')
print(head + dna + tail)
The part of the program before the cat 🐱 is:
dna = '
Using the nifty little tool to encode it, we get:
646e61203d2027
Replace the 🐱 with this string.
Next, the part of the program after the snake 🐍 is:
'
head, tail = dna.split(',')
head = bytes.fromhex(head).decode('utf-8')
tail = bytes.fromhex(tail).decode('utf-8')
print(head + dna + tail)
Using the nifty little tool to encode it, we get:
270a686561642c207461696c203d20646e612e73706c697428272c27290a68656164203d2062797465732e66726f6d6865782868656164292e6465636f646528277574662d3827290a7461696c203d2062797465732e66726f6d686578287461696c292e6465636f646528277574662d3827290a7072696e742868656164202b20646e61202b207461696c29
Replace the 🐍 with this string.
Finally, we get the following code:
dna = '646e61203d2027,270a686561642c207461696c203d20646e612e73706c697428272c27290a68656164203d2062797465732e66726f6d6865782868656164292e6465636f646528277574662d3827290a7461696c203d2062797465732e66726f6d686578287461696c292e6465636f646528277574662d3827290a7072696e742868656164202b20646e61202b207461696c29'
head, tail = dna.split(',')
head = bytes.fromhex(head).decode('utf-8')
tail = bytes.fromhex(tail).decode('utf-8')
print(head + dna + tail)
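The two manual replacements can also be scripted. The following is a minimal sketch rather than part of the original workflow: `fill_template` is a hypothetical helper that splits the template at the placeholder and substitutes the hex of the text on either side.

```python
PLACEHOLDER = '🐱,🐍'  # the same markers used in the template

TEMPLATE = "\n".join([
    "dna = '🐱,🐍'",
    "head, tail = dna.split(',')",
    "head = bytes.fromhex(head).decode('utf-8')",
    "tail = bytes.fromhex(tail).decode('utf-8')",
    "print(head + dna + tail)",
])


def fill_template(template: str) -> str:
    """Replace the placeholder with the hex of the text around it."""
    # Assumes the placeholder occurs exactly once in the template.
    before, after = template.split(PLACEHOLDER)
    dna = before.encode('utf-8').hex() + ',' + after.encode('utf-8').hex()
    return before + dna + after


program = fill_template(TEMPLATE)
print(program)
```

Because the helper only looks at the text around the placeholder, the same function should work for a template in any language, as long as the template decodes the two halves and prints head, DNA, and tail in that order.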
At this point, a Quine is complete. Run it and give it a try.
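One way to try it without comparing the output by eye is to let Python run the file and compare bytes. This is a sketch under assumptions: `is_quine` is a hypothetical helper, and the byte-for-byte comparison assumes a platform where `print` emits a bare `\n` (on Windows the line endings would differ). The usage part checks the one-line Quine from Example 1.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def is_quine(path: str) -> bool:
    """Run a Python file and report whether its stdout equals its source."""
    source = Path(path).read_bytes()
    result = subprocess.run(
        [sys.executable, path], capture_output=True, check=True
    )
    return result.stdout == source


with tempfile.TemporaryDirectory() as tmp:
    quine_path = Path(tmp) / 'quine.py'
    # The one-line Quine from Example 1, plus the trailing newline
    # that print() emits.
    quine_path.write_bytes(b"c = 'c = %r; print(c %% c)'; print(c % c)\n")
    print(is_quine(str(quine_path)))
```

The same helper can be pointed at a file containing the hex-based program, provided the file ends with a single newline.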
Let’s use C++ as an example here.
First, write a similar template:
#include <iostream>
#include <string>

void split(std::string input, std::string &first, std::string &second);
std::string hex_decode(std::string hex);

int main() {
  std::string dna = "🐱,🐍";

  std::string head, tail;
  split(dna, head, tail);
  head = hex_decode(head);
  tail = hex_decode(tail);

  std::cout << head << dna << tail << std::endl;
}

// The annoying thing is that the C++ standard library doesn't provide split and hex decode functions.
// I'm too lazy to write them myself, so I'll just let AI generate them:

void split(std::string input, std::string &first, std::string &second) {
    size_t commaPos = input.find(',');
    if (commaPos != std::string::npos) {
        first = input.substr(0, commaPos);
        second = input.substr(commaPos + 1);
    } else {
        first = input;
        second = "";
    }
}

std::string hex_decode(std::string input) {
    std::string output;
    output.reserve(input.size() / 2);
    for (size_t i = 0; i < input.size(); i += 2) {
        std::string byteString = input.substr(i, 2);
        char byte = static_cast<char>(std::strtol(byteString.c_str(), nullptr, 16));
        output.push_back(byte);
    }
    return output;
}
Then, take the program text before the 🐱 and the program text after the 🐍, encode them into hex using the nifty little tool, and replace them. We get:
#include <iostream>
#include <string>

void split(std::string input, std::string &first, std::string &second);
std::string hex_decode(std::string hex);

int main() {
  std::string dna = "23696e636c756465203c696f73747265616d3e0a23696e636c756465203c737472696e673e0a0a766f69642073706c6974287374643a3a737472696e6720696e7075742c207374643a3a737472696e67202666697273742c207374643a3a737472696e6720267365636f6e64293b0a7374643a3a737472696e67206865785f6465636f6465287374643a3a737472696e6720686578293b0a0a696e74206d61696e2829207b0a20207374643a3a737472696e6720646e61203d2022,223b0a0a20207374643a3a737472696e6720686561642c207461696c3b0a202073706c697428646e612c20686561642c207461696c293b0a202068656164203d206865785f6465636f64652868656164293b0a20207461696c203d206865785f6465636f6465287461696c293b0a0a20207374643a3a636f7574203c3c2068656164203c3c20646e61203c3c207461696c203c3c207374643a3a656e646c3b0a7d0a0a2f2f2054686520616e6e6f79696e67207468696e6720697320746861742074686520432b2b207374616e64617264206c69627261727920646f65736e27742070726f766964652073706c697420616e6420686578206465636f64652066756e6374696f6e732e0a2f2f2049276d20746f6f206c617a7920746f207772697465207468656d206d7973656c662c20736f2049276c6c206a757374206c65742041492067656e6572617465207468656d3a0a0a766f69642073706c6974287374643a3a737472696e6720696e7075742c207374643a3a737472696e67202666697273742c207374643a3a737472696e6720267365636f6e6429207b0a2020202073697a655f7420636f6d6d61506f73203d20696e7075742e66696e6428272c27293b0a2020202069662028636f6d6d61506f7320213d207374643a3a737472696e673a3a6e706f7329207b0a20202020202020206669727374203d20696e7075742e73756273747228302c20636f6d6d61506f73293b0a20202020202020207365636f6e64203d20696e7075742e73756273747228636f6d6d61506f73202b2031293b0a202020207d20656c7365207b0a20202020202020206669727374203d20696e7075743b0a20202020202020207365636f6e64203d2022223b0a202020207d0a7d0a0a7374643a3a737472696e67206865785f6465636f6465287374643a3a737472696e6720696e70757429207b0a202020207374643a3a737472696e67206f75747075743b0a202020206f75747075742e7265736572766528696e7075742e73697a652829202f2032293b0a20202020666f72202873697a655f742069203d20303b2069203c20696e7075742e73697a6528293b2069202b3d203229207b0a20202020202020207374643a3a737472696e672062797465537472696e67203d20696e7075742e73756273747228692c2032293b0a2020202020202020636861722062797465203d207374617469635f636173743c636861723e287374643a3a737472746f6c2862797465537472696e672e635f73747228292c206e756c6c7074722c20313629293b0a20202020202020206f75747075742e707573685f6261636b2862797465293b0a202020207d0a2020202072657475726e206f75747075743b0a7d";

  std::string head, tail;
  split(dna, head, tail);
  head = hex_decode(head);
  tail = hex_decode(tail);

  std::cout << head << dna << tail << std::endl;
}

// The annoying thing is that the C++ standard library doesn't provide split and hex decode functions.
// I'm too lazy to write them myself, so I'll just let AI generate them:

void split(std::string input, std::string &first, std::string &second) {
    size_t commaPos = input.find(',');
    if (commaPos != std::string::npos) {
        first = input.substr(0, commaPos);
        second = input.substr(commaPos + 1);
    } else {
        first = input;
        second = "";
    }
}

std::string hex_decode(std::string input) {
    std::string output;
    output.reserve(input.size() / 2);
    for (size_t i = 0; i < input.size(); i += 2) {
        std::string byteString = input.substr(i, 2);
        char byte = static_cast<char>(std::strtol(byteString.c_str(), nullptr, 16));
        output.push_back(byte);
    }
    return output;
}
Save this C++ source code file as quine.cpp, and then you can use the following command to verify if its output is identical to the original source code:
g++ quine.cpp && ./a.out | diff quine.cpp -
There is another type of Quine where a program written in Language A outputs source code in Language B, and this Language B source code, when run, outputs the original Language A code, forming a cycle from A to B back to A. This process can even include more languages, such as A -> B -> C -> D -> E -> A.
There is a project on GitHub that constructs a cycle of over 100 languages.
With our “dumb method” of hexadecimal encoding, escape characters are no longer a concern, so such a cycle can easily be constructed for any number of languages.
Next, let’s combine the Python and C++ examples above to construct a C++ program that outputs a Python program, which when run, outputs the original C++ program.
First, write the Python version template, which is the same as above:
dna = '🐱,🐍'
head, tail = dna.split(',')
head = bytes.fromhex(head).decode('utf-8')
tail = bytes.fromhex(tail).decode('utf-8')
print(head + dna + tail)
Then, write a C++ program that outputs the above template:
#include <iostream>
#include <string>

std::string py =
    "dna = '🐱,🐍'\n"
    "head, tail = dna.split(',')\n"
    "head = bytes.fromhex(head).decode('utf-8')\n"
    "tail = bytes.fromhex(tail).decode('utf-8')\n"
    "print(head + dna + tail)\n";

int main() {
    std::cout << py;
    return 0;
}
Then, take the program text before the 🐱 and the program text after the 🐍 in this C++ code, encode them into hex using the nifty little tool, and replace them. We get:
#include <iostream>
#include <string>

std::string py =
    "dna = '23696e636c756465203c696f73747265616d3e0a23696e636c756465203c737472696e673e0a0a7374643a3a737472696e67207079203d0a2020202022646e61203d2027,275c6e220a2020202022686561642c207461696c203d20646e612e73706c697428272c27295c6e220a202020202268656164203d2062797465732e66726f6d6865782868656164292e6465636f646528277574662d3827295c6e220a20202020227461696c203d2062797465732e66726f6d686578287461696c292e6465636f646528277574662d3827295c6e220a20202020227072696e742868656164202b20646e61202b207461696c295c6e223b0a0a696e74206d61696e2829207b0a202020207374643a3a636f7574203c3c2070793b0a2020202072657475726e20303b0a7d'\n"
    "head, tail = dna.split(',')\n"
    "head = bytes.fromhex(head).decode('utf-8')\n"
    "tail = bytes.fromhex(tail).decode('utf-8')\n"
    "print(head + dna + tail)\n";

int main() {
    std::cout << py;
    return 0;
}
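The Python half of this cycle can be checked without a C++ compiler. The sketch below is an illustration under stated assumptions, not the procedure used above: it rebuilds the C++ template from the Python lines, fills in the hex, and then runs the Python program that the C++ stage would print. It assumes the Python lines contain no double quotes or backslashes, so that each one can sit inside a C++ string literal unchanged; the variable names are hypothetical.

```python
import contextlib
import io

PLACEHOLDER = '🐱,🐍'

PY_LINES = [
    "dna = '🐱,🐍'",
    "head, tail = dna.split(',')",
    "head = bytes.fromhex(head).decode('utf-8')",
    "tail = bytes.fromhex(tail).decode('utf-8')",
    "print(head + dna + tail)",
]

# Wrap each Python line in a C++ string literal ending in a literal \n.
literals = "\n".join('    "' + line + '\\n"' for line in PY_LINES)
cpp_template = (
    "#include <iostream>\n#include <string>\n\n"
    "std::string py =\n" + literals + ";\n\n"
    "int main() {\n    std::cout << py;\n    return 0;\n}"
)

# Encode the C++ text before and after the placeholder.
before, after = cpp_template.split(PLACEHOLDER)
dna = before.encode('utf-8').hex() + ',' + after.encode('utf-8').hex()
cpp_source = before + dna + after

# The text the C++ stage would print: the Python lines with dna filled in.
py_source = "\n".join(PY_LINES).replace(PLACEHOLDER, dna) + "\n"

# Run the Python stage in-process and capture what it prints.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    exec(py_source, {})

# True when the Python stage reproduces the C++ source (plus print's newline).
print(captured.getvalue() == cpp_source + "\n")
```

The other half of the cycle, C++ printing the Python program, still needs a real compiler run to confirm.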
Save this code as quine.cpp, and verify the result:
g++ quine.cpp
./a.out > quine.py
python3 quine.py > quine2.cpp
diff quine.cpp quine2.cpp
Done.
Mistivia - https://mistivia.com