    var homoglyphSearch = require('homoglyph-search');
    var bannedWords = [/* ... */];
    var textToSearch = 'Get free ϲrEd1ᴛ';
    var results = homoglyphSearch.search(textToSearch, bannedWords);

Background

Homoglyphs are characters with different meanings that look similar or identical to each other, like the digit '0' and the capital letter 'O'. Homoglyphs within a single alphabet tend to be rare, since letters that are easily confused make a script hard to read. These days, however, the internet runs on Unicode, which means that it is possible to mix the letters from many different languages together in one place, massively increasing the number of homoglyphs.

For example, each of the characters shown below (all rendered using the same font) is different, with its own unique Unicode codepoint value, but they all look more or less like the capital letter 'A':

A Α А Ꭺ ᗅ ᴀ ꓮ Ａ

As well as creating general confusion, homoglyphs can cause particular problems for software developers. For example, if a social media website wants to protect its users from offensive language, it may create a 'black-list' of forbidden words and block any content that contains them. However, someone wishing to use one of the black-listed words could replace one of its letters with a homoglyph. The word would no longer match the one on the black-list, but its meaning would still be apparent to anyone who saw it.

I have tried to compile a list of all the homoglyphs I could find, and to make the list useful by processing it in various ways so that it is easier to use in software. The list of homoglyphs I used is based on the one that appears on the Unicode Consortium website. However, this list, although long, was incomplete, so I added some further pairs found thanks to ...

Here is a sorted text file with one group of homoglyphs on each line.
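As a rough illustration of how a list with one group of homoglyphs per line might be used against the black-list problem described above, here is a minimal sketch. It is not the implementation of the 'homoglyph-search' module. The names `buildCanonicalMap`, `normalize` and `findBannedWords` are hypothetical, the four sample groups are hand-picked for illustration rather than taken from the real file, and the banned words are made up for the example. The sketch also assumes that each line is simply a run of look-alike characters and that the first character on the line can serve as the canonical form.

```javascript
// Hypothetical sample of homoglyph groups, one group per line.
// Hand-picked for illustration; the real list is far longer.
const sampleGroups = [
  'A\u0391\u0410', // Latin A, Greek Alpha, Cyrillic A
  'c\u03F2\u0441', // Latin c, Greek lunate sigma, Cyrillic es
  't\u1D1B',       // Latin t, Latin small capital T
  'i1',            // Latin i and digit one (treated as look-alikes here)
];

// Map every character in a group to the first character of that group.
function buildCanonicalMap(groupLines) {
  const map = new Map();
  for (const line of groupLines) {
    // Array.from iterates by code point, so astral characters stay intact.
    const chars = Array.from(line).filter((ch) => ch.trim() !== '');
    if (chars.length === 0) continue;
    const canonical = chars[0];
    for (const ch of chars) map.set(ch, canonical);
  }
  return map;
}

// Replace each character with its canonical form, then lower-case it.
function normalize(text, map) {
  return Array.from(text)
    .map((ch) => (map.get(ch) ?? ch).toLowerCase())
    .join('');
}

// Return the banned words that appear in the text after normalization.
function findBannedWords(text, bannedWords, map) {
  const normalizedText = normalize(text, map);
  return bannedWords.filter((word) =>
    normalizedText.includes(normalize(word, map))
  );
}

const canonicalMap = buildCanonicalMap(sampleGroups);
// 'Get free ϲrEd1ᴛ' written with escapes so the code points are unambiguous.
const sampleText = 'Get free \u03F2rEd1\u1D1B';
const found = findBannedWords(sampleText, ['credit', 'spam'], canonicalMap);
console.log(found); // [ 'credit' ]
```

Normalizing both the text and the banned words to a canonical character keeps the comparison simple, at the cost of losing the position of each match in the original text. A fuller implementation would probably also want to report where each match occurred.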