mirror of
https://github.com/rembo10/headphones.git
synced 2026-04-21 20:39:27 +01:00
Upgraded unidecode to 0.04.17
339
lib/unidecode/LICENSE
Normal file
@@ -0,0 +1,339 @@
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
Version 2, June 1991
|
||||
|
||||
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
|
||||
Everyone is permitted to copy and distribute verbatim copies
|
||||
of this license document, but changing it is not allowed.
|
||||
|
||||
Preamble
|
||||
|
||||
The licenses for most software are designed to take away your
|
||||
freedom to share and change it. By contrast, the GNU General Public
|
||||
License is intended to guarantee your freedom to share and change free
|
||||
software--to make sure the software is free for all its users. This
|
||||
General Public License applies to most of the Free Software
|
||||
Foundation's software and to any other program whose authors commit to
|
||||
using it. (Some other Free Software Foundation software is covered by
|
||||
the GNU Lesser General Public License instead.) You can apply it to
|
||||
your programs, too.
|
||||
|
||||
When we speak of free software, we are referring to freedom, not
|
||||
price. Our General Public Licenses are designed to make sure that you
|
||||
have the freedom to distribute copies of free software (and charge for
|
||||
this service if you wish), that you receive source code or can get it
|
||||
if you want it, that you can change the software or use pieces of it
|
||||
in new free programs; and that you know you can do these things.
|
||||
|
||||
To protect your rights, we need to make restrictions that forbid
|
||||
anyone to deny you these rights or to ask you to surrender the rights.
|
||||
These restrictions translate to certain responsibilities for you if you
|
||||
distribute copies of the software, or if you modify it.
|
||||
|
||||
For example, if you distribute copies of such a program, whether
|
||||
gratis or for a fee, you must give the recipients all the rights that
|
||||
you have. You must make sure that they, too, receive or can get the
|
||||
source code. And you must show them these terms so they know their
|
||||
rights.
|
||||
|
||||
We protect your rights with two steps: (1) copyright the software, and
|
||||
(2) offer you this license which gives you legal permission to copy,
|
||||
distribute and/or modify the software.
|
||||
|
||||
Also, for each author's protection and ours, we want to make certain
|
||||
that everyone understands that there is no warranty for this free
|
||||
software. If the software is modified by someone else and passed on, we
|
||||
want its recipients to know that what they have is not the original, so
|
||||
that any problems introduced by others will not reflect on the original
|
||||
authors' reputations.
|
||||
|
||||
Finally, any free program is threatened constantly by software
|
||||
patents. We wish to avoid the danger that redistributors of a free
|
||||
program will individually obtain patent licenses, in effect making the
|
||||
program proprietary. To prevent this, we have made it clear that any
|
||||
patent must be licensed for everyone's free use or not licensed at all.
|
||||
|
||||
The precise terms and conditions for copying, distribution and
|
||||
modification follow.
|
||||
|
||||
GNU GENERAL PUBLIC LICENSE
|
||||
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
|
||||
|
||||
0. This License applies to any program or other work which contains
|
||||
a notice placed by the copyright holder saying it may be distributed
|
||||
under the terms of this General Public License. The "Program", below,
|
||||
refers to any such program or work, and a "work based on the Program"
|
||||
means either the Program or any derivative work under copyright law:
|
||||
that is to say, a work containing the Program or a portion of it,
|
||||
either verbatim or with modifications and/or translated into another
|
||||
language. (Hereinafter, translation is included without limitation in
|
||||
the term "modification".) Each licensee is addressed as "you".
|
||||
|
||||
Activities other than copying, distribution and modification are not
|
||||
covered by this License; they are outside its scope. The act of
|
||||
running the Program is not restricted, and the output from the Program
|
||||
is covered only if its contents constitute a work based on the
|
||||
Program (independent of having been made by running the Program).
|
||||
Whether that is true depends on what the Program does.
|
||||
|
||||
1. You may copy and distribute verbatim copies of the Program's
|
||||
source code as you receive it, in any medium, provided that you
|
||||
conspicuously and appropriately publish on each copy an appropriate
|
||||
copyright notice and disclaimer of warranty; keep intact all the
|
||||
notices that refer to this License and to the absence of any warranty;
|
||||
and give any other recipients of the Program a copy of this License
|
||||
along with the Program.
|
||||
|
||||
You may charge a fee for the physical act of transferring a copy, and
|
||||
you may at your option offer warranty protection in exchange for a fee.
|
||||
|
||||
2. You may modify your copy or copies of the Program or any portion
|
||||
of it, thus forming a work based on the Program, and copy and
|
||||
distribute such modifications or work under the terms of Section 1
|
||||
above, provided that you also meet all of these conditions:
|
||||
|
||||
a) You must cause the modified files to carry prominent notices
|
||||
stating that you changed the files and the date of any change.
|
||||
|
||||
b) You must cause any work that you distribute or publish, that in
|
||||
whole or in part contains or is derived from the Program or any
|
||||
part thereof, to be licensed as a whole at no charge to all third
|
||||
parties under the terms of this License.
|
||||
|
||||
c) If the modified program normally reads commands interactively
|
||||
when run, you must cause it, when started running for such
|
||||
interactive use in the most ordinary way, to print or display an
|
||||
announcement including an appropriate copyright notice and a
|
||||
notice that there is no warranty (or else, saying that you provide
|
||||
a warranty) and that users may redistribute the program under
|
||||
these conditions, and telling the user how to view a copy of this
|
||||
License. (Exception: if the Program itself is interactive but
|
||||
does not normally print such an announcement, your work based on
|
||||
the Program is not required to print an announcement.)
|
||||
|
||||
These requirements apply to the modified work as a whole. If
|
||||
identifiable sections of that work are not derived from the Program,
|
||||
and can be reasonably considered independent and separate works in
|
||||
themselves, then this License, and its terms, do not apply to those
|
||||
sections when you distribute them as separate works. But when you
|
||||
distribute the same sections as part of a whole which is a work based
|
||||
on the Program, the distribution of the whole must be on the terms of
|
||||
this License, whose permissions for other licensees extend to the
|
||||
entire whole, and thus to each and every part regardless of who wrote it.
|
||||
|
||||
Thus, it is not the intent of this section to claim rights or contest
|
||||
your rights to work written entirely by you; rather, the intent is to
|
||||
exercise the right to control the distribution of derivative or
|
||||
collective works based on the Program.
|
||||
|
||||
In addition, mere aggregation of another work not based on the Program
|
||||
with the Program (or with a work based on the Program) on a volume of
|
||||
a storage or distribution medium does not bring the other work under
|
||||
the scope of this License.
|
||||
|
||||
3. You may copy and distribute the Program (or a work based on it,
|
||||
under Section 2) in object code or executable form under the terms of
|
||||
Sections 1 and 2 above provided that you also do one of the following:
|
||||
|
||||
a) Accompany it with the complete corresponding machine-readable
|
||||
source code, which must be distributed under the terms of Sections
|
||||
1 and 2 above on a medium customarily used for software interchange; or,
|
||||
|
||||
b) Accompany it with a written offer, valid for at least three
|
||||
years, to give any third party, for a charge no more than your
|
||||
cost of physically performing source distribution, a complete
|
||||
machine-readable copy of the corresponding source code, to be
|
||||
distributed under the terms of Sections 1 and 2 above on a medium
|
||||
customarily used for software interchange; or,
|
||||
|
||||
c) Accompany it with the information you received as to the offer
|
||||
to distribute corresponding source code. (This alternative is
|
||||
allowed only for noncommercial distribution and only if you
|
||||
received the program in object code or executable form with such
|
||||
an offer, in accord with Subsection b above.)
|
||||
|
||||
The source code for a work means the preferred form of the work for
|
||||
making modifications to it. For an executable work, complete source
|
||||
code means all the source code for all modules it contains, plus any
|
||||
associated interface definition files, plus the scripts used to
|
||||
control compilation and installation of the executable. However, as a
|
||||
special exception, the source code distributed need not include
|
||||
anything that is normally distributed (in either source or binary
|
||||
form) with the major components (compiler, kernel, and so on) of the
|
||||
operating system on which the executable runs, unless that component
|
||||
itself accompanies the executable.
|
||||
|
||||
If distribution of executable or object code is made by offering
|
||||
access to copy from a designated place, then offering equivalent
|
||||
access to copy the source code from the same place counts as
|
||||
distribution of the source code, even though third parties are not
|
||||
compelled to copy the source along with the object code.
|
||||
|
||||
4. You may not copy, modify, sublicense, or distribute the Program
|
||||
except as expressly provided under this License. Any attempt
|
||||
otherwise to copy, modify, sublicense or distribute the Program is
|
||||
void, and will automatically terminate your rights under this License.
|
||||
However, parties who have received copies, or rights, from you under
|
||||
this License will not have their licenses terminated so long as such
|
||||
parties remain in full compliance.
|
||||
|
||||
5. You are not required to accept this License, since you have not
|
||||
signed it. However, nothing else grants you permission to modify or
|
||||
distribute the Program or its derivative works. These actions are
|
||||
prohibited by law if you do not accept this License. Therefore, by
|
||||
modifying or distributing the Program (or any work based on the
|
||||
Program), you indicate your acceptance of this License to do so, and
|
||||
all its terms and conditions for copying, distributing or modifying
|
||||
the Program or works based on it.
|
||||
|
||||
6. Each time you redistribute the Program (or any work based on the
|
||||
Program), the recipient automatically receives a license from the
|
||||
original licensor to copy, distribute or modify the Program subject to
|
||||
these terms and conditions. You may not impose any further
|
||||
restrictions on the recipients' exercise of the rights granted herein.
|
||||
You are not responsible for enforcing compliance by third parties to
|
||||
this License.
|
||||
|
||||
7. If, as a consequence of a court judgment or allegation of patent
|
||||
infringement or for any other reason (not limited to patent issues),
|
||||
conditions are imposed on you (whether by court order, agreement or
|
||||
otherwise) that contradict the conditions of this License, they do not
|
||||
excuse you from the conditions of this License. If you cannot
|
||||
distribute so as to satisfy simultaneously your obligations under this
|
||||
License and any other pertinent obligations, then as a consequence you
|
||||
may not distribute the Program at all. For example, if a patent
|
||||
license would not permit royalty-free redistribution of the Program by
|
||||
all those who receive copies directly or indirectly through you, then
|
||||
the only way you could satisfy both it and this License would be to
|
||||
refrain entirely from distribution of the Program.
|
||||
|
||||
If any portion of this section is held invalid or unenforceable under
|
||||
any particular circumstance, the balance of the section is intended to
|
||||
apply and the section as a whole is intended to apply in other
|
||||
circumstances.
|
||||
|
||||
It is not the purpose of this section to induce you to infringe any
|
||||
patents or other property right claims or to contest validity of any
|
||||
such claims; this section has the sole purpose of protecting the
|
||||
integrity of the free software distribution system, which is
|
||||
implemented by public license practices. Many people have made
|
||||
generous contributions to the wide range of software distributed
|
||||
through that system in reliance on consistent application of that
|
||||
system; it is up to the author/donor to decide if he or she is willing
|
||||
to distribute software through any other system and a licensee cannot
|
||||
impose that choice.
|
||||
|
||||
This section is intended to make thoroughly clear what is believed to
|
||||
be a consequence of the rest of this License.
|
||||
|
||||
8. If the distribution and/or use of the Program is restricted in
|
||||
certain countries either by patents or by copyrighted interfaces, the
|
||||
original copyright holder who places the Program under this License
|
||||
may add an explicit geographical distribution limitation excluding
|
||||
those countries, so that distribution is permitted only in or among
|
||||
countries not thus excluded. In such case, this License incorporates
|
||||
the limitation as if written in the body of this License.
|
||||
|
||||
9. The Free Software Foundation may publish revised and/or new versions
|
||||
of the General Public License from time to time. Such new versions will
|
||||
be similar in spirit to the present version, but may differ in detail to
|
||||
address new problems or concerns.
|
||||
|
||||
Each version is given a distinguishing version number. If the Program
|
||||
specifies a version number of this License which applies to it and "any
|
||||
later version", you have the option of following the terms and conditions
|
||||
either of that version or of any later version published by the Free
|
||||
Software Foundation. If the Program does not specify a version number of
|
||||
this License, you may choose any version ever published by the Free Software
|
||||
Foundation.
|
||||
|
||||
10. If you wish to incorporate parts of the Program into other free
|
||||
programs whose distribution conditions are different, write to the author
|
||||
to ask for permission. For software which is copyrighted by the Free
|
||||
Software Foundation, write to the Free Software Foundation; we sometimes
|
||||
make exceptions for this. Our decision will be guided by the two goals
|
||||
of preserving the free status of all derivatives of our free software and
|
||||
of promoting the sharing and reuse of software generally.
|
||||
|
||||
NO WARRANTY
|
||||
|
||||
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
|
||||
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
|
||||
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
|
||||
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
|
||||
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
|
||||
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
|
||||
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
|
||||
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
|
||||
REPAIR OR CORRECTION.
|
||||
|
||||
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
|
||||
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
|
||||
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
|
||||
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
|
||||
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
|
||||
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
|
||||
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
|
||||
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
|
||||
POSSIBILITY OF SUCH DAMAGES.
|
||||
|
||||
END OF TERMS AND CONDITIONS
|
||||
|
||||
How to Apply These Terms to Your New Programs
|
||||
|
||||
If you develop a new program, and you want it to be of the greatest
|
||||
possible use to the public, the best way to achieve this is to make it
|
||||
free software which everyone can redistribute and change under these terms.
|
||||
|
||||
To do so, attach the following notices to the program. It is safest
|
||||
to attach them to the start of each source file to most effectively
|
||||
convey the exclusion of warranty; and each file should have at least
|
||||
the "copyright" line and a pointer to where the full notice is found.
|
||||
|
||||
<one line to give the program's name and a brief idea of what it does.>
|
||||
Copyright (C) <year> <name of author>
|
||||
|
||||
This program is free software; you can redistribute it and/or modify
|
||||
it under the terms of the GNU General Public License as published by
|
||||
the Free Software Foundation; either version 2 of the License, or
|
||||
(at your option) any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful,
|
||||
but WITHOUT ANY WARRANTY; without even the implied warranty of
|
||||
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
||||
GNU General Public License for more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License along
|
||||
with this program; if not, write to the Free Software Foundation, Inc.,
|
||||
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
|
||||
|
||||
Also add information on how to contact you by electronic and paper mail.
|
||||
|
||||
If the program is interactive, make it output a short notice like this
|
||||
when it starts in an interactive mode:
|
||||
|
||||
Gnomovision version 69, Copyright (C) year name of author
|
||||
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
|
||||
This is free software, and you are welcome to redistribute it
|
||||
under certain conditions; type `show c' for details.
|
||||
|
||||
The hypothetical commands `show w' and `show c' should show the appropriate
|
||||
parts of the General Public License. Of course, the commands you use may
|
||||
be called something other than `show w' and `show c'; they could even be
|
||||
mouse-clicks or menu items--whatever suits your program.
|
||||
|
||||
You should also get your employer (if you work as a programmer) or your
|
||||
school, if any, to sign a "copyright disclaimer" for the program, if
|
||||
necessary. Here is a sample; alter the names:
|
||||
|
||||
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
|
||||
`Gnomovision' (which makes passes at compilers) written by James Hacker.
|
||||
|
||||
<signature of Ty Coon>, 1 April 1989
|
||||
Ty Coon, President of Vice
|
||||
|
||||
This General Public License does not permit incorporating your program into
|
||||
proprietary programs. If your program is a subroutine library, you may
|
||||
consider it more useful to permit linking proprietary applications with the
|
||||
library. If this is what you want to do, use the GNU Lesser General
|
||||
Public License instead of this License.
|
||||
134
lib/unidecode/README
Normal file
@@ -0,0 +1,134 @@
|
||||
Unidecode, lossy ASCII transliterations of Unicode text
|
||||
=======================================================
|
||||
|
||||
It often happens that you have text data in Unicode, but you need to
represent it in ASCII. For example, when integrating with legacy code that
doesn't support Unicode, or for ease of entry of non-Roman names on a US
keyboard, or when constructing ASCII machine identifiers from
human-readable Unicode strings that should still be somewhat intelligible
(a popular example of this is when making a URL slug from an article
title).
|
||||
|
||||
In most of these examples you could represent Unicode characters as
|
||||
"???" or "\\15BA\\15A0\\1610", to mention two extreme cases. But that's
|
||||
nearly useless to someone who actually wants to read what the text says.
|
||||
|
||||
What Unidecode provides is a middle road: function unidecode() takes
|
||||
Unicode data and tries to represent it in ASCII characters (i.e., the
|
||||
universally displayable characters between 0x00 and 0x7F), where the
|
||||
compromises taken when mapping between two character sets are chosen to be
|
||||
near what a human with a US keyboard would choose.
|
||||
|
||||
The quality of the resulting ASCII representation varies. For languages of
western origin it should be between perfect and good. On the other hand,
transliteration (i.e., conveying, in Roman letters, the pronunciation
expressed by the text in some other writing system) of languages like
Chinese, Japanese or Korean is a very complex issue and this library does
not even attempt to address it. It draws the line at context-free
character-by-character mapping. So a good rule of thumb is that the further
the script you are transliterating is from the Latin alphabet, the worse the
transliteration will be.
|
||||
|
||||
Note that this module generally produces better results than simply
|
||||
stripping accents from characters (which can be done in Python with
|
||||
built-in functions). It is based on hand-tuned character mappings that for
|
||||
example also contain ASCII approximations for symbols and non-Latin
|
||||
alphabets.
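
For comparison, here is a minimal sketch of the accent-stripping approach mentioned above, written for Python 3 and using only the standard ``unicodedata`` module. The helper name ``strip_accents`` is an assumption made for illustration and is not part of Unidecode. It suggests why decomposition alone falls short: a character with no decomposition, such as the German sharp s, is simply dropped.

```python
import unicodedata


def strip_accents(text):
    """Hypothetical helper: decompose characters, then drop all non-ASCII."""
    # NFKD splits e.g. U+017E into 'z' followed by a combining caron.
    decomposed = unicodedata.normalize('NFKD', text)
    # Encoding with 'ignore' discards the combining marks and anything
    # else that has no ASCII form.
    return decomposed.encode('ascii', 'ignore').decode('ascii')


print(strip_accents('ko\u017eu\u0161\u010dek'))  # kozuscek
print(strip_accents('Stra\u00dfe'))              # Strae (the sharp s is lost)
```

A hand-tuned table can map the sharp s to ``ss`` instead, which is the kind of difference the paragraph above refers to.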
|
||||
|
||||
This is a Python port of the Text::Unidecode Perl module by
|
||||
Sean M. Burke <sburke@cpan.org>.
|
||||
|
||||
|
||||
Module content
|
||||
--------------
|
||||
|
||||
The module exports a single function that takes a Unicode object (Python
|
||||
2.x) or string (Python 3.x) and returns a string (that can be encoded to
|
||||
ASCII bytes in Python 3.x)::
|
||||
|
||||
>>> from unidecode import unidecode
|
||||
>>> unidecode(u'ko\u017eu\u0161\u010dek')
|
||||
'kozuscek'
|
||||
>>> unidecode(u'30 \U0001d5c4\U0001d5c6/\U0001d5c1')
|
||||
'30 km/h'
|
||||
>>> unidecode(u"\u5317\u4EB0")
|
||||
'Bei Jing '
|
||||
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
Nothing except Python itself.
|
||||
|
||||
You need a Python build with "wide" Unicode characters (also called "UCS-4
|
||||
build") in order for unidecode to work correctly with characters outside of the
|
||||
Basic Multilingual Plane (BMP). Common characters outside BMP are bold, italic,
|
||||
script, etc. variants of the Latin alphabet intended for mathematical notation.
|
||||
Surrogate pair encoding of "narrow" builds is not supported in unidecode.
|
||||
|
||||
If your Python build supports "wide" Unicode, the following expression will
|
||||
return True::
|
||||
|
||||
>>> import sys
|
||||
>>> sys.maxunicode > 0xffff
|
||||
True
|
||||
|
||||
See PEP 261 for details regarding support for "wide" Unicode characters in
|
||||
Python.
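
The check above can be paired with a scan for stray surrogate code points, which is what a narrow build would hand to unidecode for characters outside the BMP. The following is only a sketch under the assumption of Python 3; the helper names ``is_wide_build`` and ``find_surrogates`` are hypothetical and do not exist in Unidecode. On Python 3.3 and later every build behaves as "wide", so the first check is mainly of interest on Python 2.

```python
import sys


def is_wide_build():
    """True when the interpreter can hold any code point in one character."""
    return sys.maxunicode > 0xffff


def find_surrogates(text):
    """Hypothetical helper: list the surrogate code points found in text."""
    return [ch for ch in text if 0xd800 <= ord(ch) <= 0xdfff]


print(is_wide_build())
print(find_surrogates('30 \U0001d5c4\U0001d5c6/\U0001d5c1'))  # [] on a wide build
```

An empty list means the string holds no lone surrogates, so characters outside the BMP reach the lookup as single code points.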
|
||||
|
||||
|
||||
Installation
|
||||
------------
|
||||
|
||||
You install Unidecode, as you would install any Python module, by running
|
||||
these commands::
|
||||
|
||||
python setup.py install
|
||||
python setup.py test
|
||||
|
||||
|
||||
Source
|
||||
------
|
||||
|
||||
You can get the latest development version of Unidecode with::
|
||||
|
||||
git clone https://www.tablix.org/~avian/git/unidecode.git
|
||||
|
||||
|
||||
Support
|
||||
-------
|
||||
|
||||
Questions, bug reports, useful code bits, and suggestions for Unidecode
|
||||
should be sent to tomaz.solc@tablix.org
|
||||
|
||||
|
||||
Copyright
|
||||
---------
|
||||
|
||||
Original character transliteration tables:
|
||||
|
||||
Copyright 2001, Sean M. Burke <sburke@cpan.org>, all rights reserved.
|
||||
|
||||
Python code and later additions:
|
||||
|
||||
Copyright 2014, Tomaz Solc <tomaz.solc@tablix.org>
|
||||
|
||||
This program is free software; you can redistribute it and/or modify it
|
||||
under the terms of the GNU General Public License as published by the Free
|
||||
Software Foundation; either version 2 of the License, or (at your option)
|
||||
any later version.
|
||||
|
||||
This program is distributed in the hope that it will be useful, but WITHOUT
|
||||
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
|
||||
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
|
||||
more details.
|
||||
|
||||
You should have received a copy of the GNU General Public License along
|
||||
with this program; if not, write to the Free Software Foundation, Inc., 51
|
||||
Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. The programs and
|
||||
documentation in this dist are distributed in the hope that they will be
|
||||
useful, but without any warranty; without even the implied warranty of
|
||||
merchantability or fitness for a particular purpose.
|
||||
|
||||
..
|
||||
vim: set filetype=rst:
|
||||
@@ -1,4 +1,5 @@
|
||||
# -*- coding: utf-8 -*-
|
||||
# vi:tabstop=4:expandtab:sw=4
|
||||
"""Transliterate Unicode text into plain 7-bit ASCII.
|
||||
|
||||
Example usage:
|
||||
@@ -39,10 +40,15 @@ def unidecode(string):
|
||||
if codepoint < 0x80: # Basic ASCII
|
||||
retval.append(str(char))
|
||||
continue
|
||||
|
||||
|
||||
if codepoint > 0xeffff:
|
||||
continue # Characters in Private Use Area and above are ignored
|
||||
|
||||
if 0xd800 <= codepoint <= 0xdfff:
|
||||
warnings.warn( "Surrogate character %r will be ignored. "
|
||||
"You might be using a narrow Python build." % (char,),
|
||||
RuntimeWarning, 2)
|
||||
|
||||
section = codepoint >> 8 # Chop off the last two hex digits
|
||||
position = codepoint % 256 # Last two hex digits
|
||||
|
||||
@@ -50,7 +56,7 @@ def unidecode(string):
|
||||
table = Cache[section]
|
||||
except KeyError:
|
||||
try:
|
||||
-mod = __import__('unidecode.x%03x'%(section), [], [], ['data'])
+mod = __import__('unidecode.x%03x'%(section), globals(), locals(), ['data'])
|
||||
except ImportError:
|
||||
Cache[section] = None
|
||||
continue # No match: ignore this character and carry on.
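
The section and position arithmetic in the hunks above can be tried in isolation. This is a sketch only, assuming Python 3: the helper names ``locate`` and ``table_module_name`` are hypothetical, and only the shift, the modulo and the ``'unidecode.x%03x'`` name pattern are taken from the code shown.

```python
def locate(char):
    """Hypothetical helper: split a character's code point as the hunk does."""
    codepoint = ord(char)
    section = codepoint >> 8    # chop off the last two hex digits
    position = codepoint % 256  # last two hex digits
    return section, position


def table_module_name(section):
    """Name of the module expected to hold the table for one section."""
    return 'unidecode.x%03x' % section


section, position = locate('\u5317')
print(hex(section), hex(position), table_module_name(section))
# 0x53 0x17 unidecode.x053
```

Each section therefore corresponds to one table module of 256 entries, and ``position`` is the index into that module's ``data`` tuple.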
|
||||
|
||||
@@ -1,132 +1,15 @@
|
||||
data = (
|
||||
'\x00', # 0x00
|
||||
'\x01', # 0x01
|
||||
'\x02', # 0x02
|
||||
'\x03', # 0x03
|
||||
'\x04', # 0x04
|
||||
'\x05', # 0x05
|
||||
'\x06', # 0x06
|
||||
'\x07', # 0x07
|
||||
'\x08', # 0x08
|
||||
'\x09', # 0x09
|
||||
'\x0a', # 0x0a
|
||||
'\x0b', # 0x0b
|
||||
'\x0c', # 0x0c
|
||||
'\x0d', # 0x0d
|
||||
'\x0e', # 0x0e
|
||||
'\x0f', # 0x0f
|
||||
'\x10', # 0x10
|
||||
'\x11', # 0x11
|
||||
'\x12', # 0x12
|
||||
'\x13', # 0x13
|
||||
'\x14', # 0x14
|
||||
'\x15', # 0x15
|
||||
'\x16', # 0x16
|
||||
'\x17', # 0x17
|
||||
'\x18', # 0x18
|
||||
'\x19', # 0x19
|
||||
'\x1a', # 0x1a
|
||||
'\x1b', # 0x1b
|
||||
'\x1c', # 0x1c
|
||||
'\x1d', # 0x1d
|
||||
'\x1e', # 0x1e
|
||||
'\x1f', # 0x1f
|
||||
' ', # 0x20
|
||||
'!', # 0x21
|
||||
'"', # 0x22
|
||||
'#', # 0x23
|
||||
'$', # 0x24
|
||||
'%', # 0x25
|
||||
'&', # 0x26
|
||||
'\'', # 0x27
|
||||
'(', # 0x28
|
||||
')', # 0x29
|
||||
'*', # 0x2a
|
||||
'+', # 0x2b
|
||||
',', # 0x2c
|
||||
'-', # 0x2d
|
||||
'.', # 0x2e
|
||||
'/', # 0x2f
|
||||
'0', # 0x30
|
||||
'1', # 0x31
|
||||
'2', # 0x32
|
||||
'3', # 0x33
|
||||
'4', # 0x34
|
||||
'5', # 0x35
|
||||
'6', # 0x36
|
||||
'7', # 0x37
|
||||
'8', # 0x38
|
||||
'9', # 0x39
|
||||
':', # 0x3a
|
||||
';', # 0x3b
|
||||
'<', # 0x3c
|
||||
'=', # 0x3d
|
||||
'>', # 0x3e
|
||||
'?', # 0x3f
|
||||
'@', # 0x40
|
||||
'A', # 0x41
|
||||
'B', # 0x42
|
||||
'C', # 0x43
|
||||
'D', # 0x44
|
||||
'E', # 0x45
|
||||
'F', # 0x46
|
||||
'G', # 0x47
|
||||
'H', # 0x48
|
||||
'I', # 0x49
|
||||
'J', # 0x4a
|
||||
'K', # 0x4b
|
||||
'L', # 0x4c
|
||||
'M', # 0x4d
|
||||
'N', # 0x4e
|
||||
'O', # 0x4f
|
||||
'P', # 0x50
|
||||
'Q', # 0x51
|
||||
'R', # 0x52
|
||||
'S', # 0x53
|
||||
'T', # 0x54
|
||||
'U', # 0x55
|
||||
'V', # 0x56
|
||||
'W', # 0x57
|
||||
'X', # 0x58
|
||||
'Y', # 0x59
|
||||
'Z', # 0x5a
|
||||
']', # 0x5b
|
||||
'\\', # 0x5c
|
||||
']', # 0x5d
|
||||
'^', # 0x5e
|
||||
'_', # 0x5f
|
||||
'`', # 0x60
|
||||
'a', # 0x61
|
||||
'b', # 0x62
|
||||
'c', # 0x63
|
||||
'd', # 0x64
|
||||
'e', # 0x65
|
||||
'f', # 0x66
|
||||
'g', # 0x67
|
||||
'h', # 0x68
|
||||
'i', # 0x69
|
||||
'j', # 0x6a
|
||||
'k', # 0x6b
|
||||
'l', # 0x6c
|
||||
'm', # 0x6d
|
||||
'n', # 0x6e
|
||||
'o', # 0x6f
|
||||
'p', # 0x70
|
||||
'q', # 0x71
|
||||
'r', # 0x72
|
||||
's', # 0x73
|
||||
't', # 0x74
|
||||
'u', # 0x75
|
||||
'v', # 0x76
|
||||
'w', # 0x77
|
||||
'x', # 0x78
|
||||
'y', # 0x79
|
||||
'z', # 0x7a
|
||||
'{', # 0x7b
|
||||
'|', # 0x7c
|
||||
'}', # 0x7d
|
||||
'~', # 0x7e
|
||||
'', # 0x7f
|
||||
# Code points u+007f and below are equivalent to ASCII and are handled by a
|
||||
# special case in the code. Hence they are not present in this table.
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
'', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
|
||||
|
||||
'', # 0x80
|
||||
'', # 0x81
|
||||
'', # 0x82
|
||||
@@ -162,7 +45,10 @@ data = (
|
||||
' ', # 0xa0
|
||||
'!', # 0xa1
|
||||
'C/', # 0xa2
|
||||
|
||||
# Not "GBP" - Pound Sign is used for more than just British Pounds.
|
||||
'PS', # 0xa3
|
||||
|
||||
'$?', # 0xa4
|
||||
'Y=', # 0xa5
|
||||
'|', # 0xa6
|
||||
@@ -177,8 +63,11 @@ data = (
|
||||
'-', # 0xaf
|
||||
'deg', # 0xb0
|
||||
'+-', # 0xb1
|
||||
|
||||
# These might be combined with other superscript digits (u+2070 - u+2079)
|
||||
'2', # 0xb2
|
||||
'3', # 0xb3
|
||||
|
||||
'\'', # 0xb4
|
||||
'u', # 0xb5
|
||||
'P', # 0xb6
|
||||
@@ -195,7 +84,10 @@ data = (
|
||||
'A', # 0xc1
|
||||
'A', # 0xc2
|
||||
'A', # 0xc3
|
||||
|
||||
# Not "AE" - used in languages other than German
|
||||
'A', # 0xc4
|
||||
|
||||
'A', # 0xc5
|
||||
'AE', # 0xc6
|
||||
'C', # 0xc7
|
||||
@@ -213,13 +105,19 @@ data = (
|
||||
'O', # 0xd3
|
||||
'O', # 0xd4
|
||||
'O', # 0xd5
|
||||
|
||||
# Not "OE" - used in languages other than German
|
||||
'O', # 0xd6
|
||||
|
||||
'x', # 0xd7
|
||||
'O', # 0xd8
|
||||
'U', # 0xd9
|
||||
'U', # 0xda
|
||||
'U', # 0xdb
|
||||
|
||||
# Not "UE" - used in languages other than German
|
||||
'U', # 0xdc
|
||||
|
||||
'Y', # 0xdd
|
||||
'Th', # 0xde
|
||||
'ss', # 0xdf
|
||||
@@ -227,7 +125,10 @@ data = (
|
||||
'a', # 0xe1
|
||||
'a', # 0xe2
|
||||
'a', # 0xe3
|
||||
|
||||
# Not "ae" - used in languages other than German
|
||||
'a', # 0xe4
|
||||
|
||||
'a', # 0xe5
|
||||
'ae', # 0xe6
|
||||
'c', # 0xe7
|
||||
@@ -245,13 +146,19 @@ data = (
|
||||
'o', # 0xf3
|
||||
'o', # 0xf4
|
||||
'o', # 0xf5
|
||||
|
||||
# Not "oe" - used in languages other than German
|
||||
'o', # 0xf6
|
||||
|
||||
'/', # 0xf7
|
||||
'o', # 0xf8
|
||||
'u', # 0xf9
|
||||
'u', # 0xfa
|
||||
'u', # 0xfb
|
||||
|
||||
# Not "ue" - used in languages other than German
|
||||
'u', # 0xfc
|
||||
|
||||
'y', # 0xfd
|
||||
'th', # 0xfe
|
||||
'y', # 0xff
|
||||
|
||||
@@ -171,7 +171,7 @@ data = (
|
||||
'W', # 0xa9
|
||||
'NS', # 0xaa
|
||||
'D', # 0xab
|
||||
-'EU', # 0xac
+'EUR', # 0xac
|
||||
'K', # 0xad
|
||||
'T', # 0xae
|
||||
'Dr', # 0xaf
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
data = (
|
||||
'', # 0x00
|
||||
'', # 0x01
|
||||
-'', # 0x02
+'C', # 0x02
|
||||
'', # 0x03
|
||||
'', # 0x04
|
||||
'', # 0x05
|
||||
@@ -12,7 +12,7 @@ data = (
|
||||
'', # 0x0a
|
||||
'', # 0x0b
|
||||
'', # 0x0c
|
||||
-'', # 0x0d
+'H', # 0x0d
|
||||
'', # 0x0e
|
||||
'', # 0x0f
|
||||
'', # 0x10
|
||||
@@ -20,22 +20,22 @@ data = (
|
||||
'', # 0x12
|
||||
'', # 0x13
|
||||
'', # 0x14
|
||||
-'', # 0x15
+'N', # 0x15
|
||||
'', # 0x16
|
||||
'', # 0x17
|
||||
'', # 0x18
|
||||
-'', # 0x19
-'', # 0x1a
+'P', # 0x19
+'Q', # 0x1a
|
||||
'', # 0x1b
|
||||
'', # 0x1c
|
||||
-'', # 0x1d
+'R', # 0x1d
|
||||
'', # 0x1e
|
||||
'', # 0x1f
|
||||
'(sm)', # 0x20
|
||||
'TEL', # 0x21
|
||||
'(tm)', # 0x22
|
||||
'', # 0x23
|
||||
-'', # 0x24
+'Z', # 0x24
|
||||
'', # 0x25
|
||||
'', # 0x26
|
||||
'', # 0x27
|
||||
@@ -45,12 +45,12 @@ data = (
|
||||
'A', # 0x2b
|
||||
'', # 0x2c
|
||||
'', # 0x2d
|
||||
-'', # 0x2e
-'', # 0x2f
-'', # 0x30
-'', # 0x31
+'e', # 0x2e
+'e', # 0x2f
+'E', # 0x30
+'F', # 0x31
|
||||
'F', # 0x32
|
||||
-'', # 0x33
+'M', # 0x33
|
||||
'', # 0x34
|
||||
'', # 0x35
|
||||
'', # 0x36
|
||||
@@ -59,20 +59,20 @@ data = (
|
||||
'', # 0x39
|
||||
'', # 0x3a
|
||||
'FAX', # 0x3b
|
||||
-'[?]', # 0x3c
-'[?]', # 0x3d
-'[?]', # 0x3e
-'[?]', # 0x3f
+'', # 0x3c
+'', # 0x3d
+'', # 0x3e
+'', # 0x3f
|
||||
'[?]', # 0x40
|
||||
'[?]', # 0x41
|
||||
'[?]', # 0x42
|
||||
'[?]', # 0x43
|
||||
'[?]', # 0x44
|
||||
-'[?]', # 0x45
-'[?]', # 0x46
-'[?]', # 0x47
-'[?]', # 0x48
-'[?]', # 0x49
+'D', # 0x45
+'d', # 0x46
+'e', # 0x47
+'i', # 0x48
+'j', # 0x49
|
||||
'[?]', # 0x4a
|
||||
'[?]', # 0x4b
|
||||
'[?]', # 0x4c
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
data = (
|
||||
-'[?] ', # 0x00
+'Yi ', # 0x00
|
||||
'Ding ', # 0x01
|
||||
'Kao ', # 0x02
|
||||
'Qi ', # 0x03
|
||||
|
||||