diff options
Diffstat (limited to 'moon-abe/pbc-0.5.14/doc/internal.txt')
-rw-r--r-- | moon-abe/pbc-0.5.14/doc/internal.txt | 428 |
1 files changed, 428 insertions, 0 deletions
diff --git a/moon-abe/pbc-0.5.14/doc/internal.txt b/moon-abe/pbc-0.5.14/doc/internal.txt new file mode 100644 index 00000000..b2f217e3 --- /dev/null +++ b/moon-abe/pbc-0.5.14/doc/internal.txt @@ -0,0 +1,428 @@ +== PBC internals == + +The source code is organized by subdirectories: + +*`include`*: Headers describing the official API. Headers in other places +are for internal use only. + +*`arith`*: Finite fields: modular arithmetic, polynomial rings, and polynomial +rings modulo a polynomial. Finite fields of low characteristic are unsupported. + +*`ecc`*: Elliptic curve generation, elliptic curve groups and pairings. One +source file is dedicated to each type of pairing, containing specialized +optimizations. Some of the code requires arbitrary precision complex numbers, +which also live here but should be moved elsewhere one day. + +*`misc`*: Dynamic arrays, symbol tables, benchmarking, logging, debugging, +other utilities. + +*`gen`*: Programs that generate pairing parameters and list Hilbert +polynomials. These were used to prepare the samples in the `param` directory. + +*`example`*: Example programs showing how to use the library. + +*`guru`*: Tests, experimental code. + +=== Groups, rings, fields === + +Algebraic structures are represented in the +field_t+ data type, which mostly +contains pointers to functions written to perform operations such as addition +and multiplication in that particular group, ring or field: + + struct field_s { + ... + void (*init)(element_ptr); + void (*clear)(element_ptr); + ... + void (*add)(element_ptr, element_ptr, element_ptr); + void (*sub)(element_ptr, element_ptr, element_ptr); + void (*mul)(element_ptr, element_ptr, element_ptr); + ... + }; + typedef struct field_s *field_ptr; + typedef struct field_s field_t[1]; + +The name +algebraic_structure_t+ is arguably more accurate, but far too +cumbersome. It may help if one views groups and rings as handicapped fields. + +The last two lines of the above code excerpt show how GMP and PBC define data +types: they are arrays of length one so that when a variable is +declared, space is automatically allocated for it on the stack. +Yet when used as a argument to a function, a pointer is passed, thus there is +no need to explicitly allocate and deallocate memory, nor reference and +dereference variables. + +Each +element_t+ contains a field named +field+ to such a +field_t+ variable. +The only other field is +data+, which stores any data needed for the +implementation of the particular algebraic structure the element resides in. + + struct element_s { + struct field_s *field; + void *data; + }; + +When an +element_t+ variable is initialized, +field+ is set appropriately, and +then the initialization specific to that field is called to complete the +initialization. Here, a line of code is worth a thousand words: + + void element_init(element_t e, field_ptr f) { + e->field = f; + f->init(e); + } + +Thus during a call to one of the `element_` functions, the +field+ pointer is +followed then the appropriate routine is executed. For example, modular addition +results when the input element is an element of a finite field, while +polynomial addition is performed for elements of a polynomial ring and so on. + + void element_add(element_t n, element_t a, element_t b) { + n->field->add(n, a, b); + } + +My design may seem dangerous because if a programmer inadvertently attempts +to add a polynomial and a point on an elliptic curve, say, the code +will compile without warnings since they have the same data type. + +However I settled on having a catch-all ``glorified +void *+'' +element_t+ +because I wanted to + +- extend a field an arbitrary number of times (though in practice, currently I + only need to extend a field twice at most), +- switch fields easily, so for example a program that benchmarks addition in + polynomial rings can be trivially modified to benchmark addition in a group, + and +- interchange different implementations of the same algebraic structure, for + example, compare Montgomery representation versus a naive implementation of + integer modulo rings. + +Additionally, defining `PBC_DEBUG` catches many type mismatches. + +In mathematics, groups, rings and fields should be distinguished, but for +implmentation, it is simplest lump them together under the same heading. +In any event, distinct data types may lead to a false sense of security. +Fields of prime order with different moduli would still fall under the same +data type, with unpleasant results if their elements are mistakenly mixed. + +I have vague plans to add flags to +field_t+ describing the capabilities of a +particular +field_t+. These flags would be set during initialization, and +would indicate for example whether one can invert every nonzero element, +whether there are one or two operations (that is, group versus ring), whether +the field is an integer mod ring, polynomial ring, or polynomial mod ring, and +so on. Once in place, more runtime checks can be performed to avoid illegal +inversion and similar problems. + +Another option is to introduce data types for each of the four pairing-related +algebraic structures, namely G1, G2, GT and Zr, as these are the only ones +needed for implementing pairing-based cryptosystems. + +An alternative was to simply use +void *+ instead of +element_t+ and require +the programmer to pass the field as a parameter, e.g. +element_add(a, b, c, +F_13)+, but I decided the added annoyance of having to type this extra variable +every time negated any benefits, such as obviating the need for the ++field+ pointer in +struct element_s+, even if one ignores +the more serious problem that runtime type checking is considerably harder, if +not impossible. + +I suppose one could write a preprocessor to convert one type of notation +to the other, but I would like the code to be standard C. (On the other hand, +as Hovav Shacham suggested, it may be nice to eventually have a converter that +takes human-friendly infix operator expressions like `a = (b + c) * +d` and outputs the assembly-like `element_` equivalents.) + +=== Internal randomness === + +Some algorithms require a quadratic nonresidue in a given field. These +are computed lazily: The first time a quadratic nonresidue is requested, one is +generated at random, using the same source of random bits as other PBC random +functions. [Which reminds me, should I get rid of the +nqr+ field and instead +have it as part of the +data+ field in struct field_s?] + +In `fieldquadratic.c`, a quadratic field extension is constructed with a square +root of this randomly generated quadratic nonresidue in the base field. Thus +for a nondeterminstic source of random bits, the same field may be constructed +differently on different runs. + +To construct the same field the same way every time, one must record the +quadratic nonresidue generated from one run, and call `field_set_nqr()` every +time this particular construction of a quadratic field extension is desired. +Another use for this function is to save time by setting the quadratic +nonresidue to some precomputed value. + +Similarly, for higher degree extensions, a random irreducible polynomial +may be chosen to construct it, but this must be recorded if the same +construction is later required. + +This happens behind the scenes in PBC. + +=== Type A internals === + +Type A pairings are constructed on the curve y^2^ = x^3^ + x over the field F_q +for some prime q = 3 mod 4. +Both G1 and G2 are the group of points E(F_q), so this +pairing is symmetric. It turns out #E(F_q) = q + 1 and +#E(F_q^2^) = (q + 1)^2^. Thus the embedding degree k is 2, +and hence GT is a subgroup of F_q^2. The order r is some prime +factor of q + 1. + +Write q + 1 = r * h. For efficiency, r is picked to be a Solinas prime, +that is, r has the form 2^a^ +- 2^b^ +- 1 for some integers 0 < b < a. + +Also, we choose q = -1 mod 12 so F_q^2^ can be implemented as F_q[i] +(where i = sqrt(-1)) and since q = -1 mod 3, cube roots in F_q +are easy to compute. This latter feature may be removed because I have +not found a use for it yet (in which case we only need q = -1 mod 4). + ++a_param+ struct fields: + + exp2, exp1, sign1, sign0, r: + r = 2^exp2 + sign1 * 2^exp1 + sign0 * 1 (Solinas prime) + q, h: + r * h = q + 1 + q is a prime, h is a multiple of 12 (thus q = -1 mod 12) + +Type A1 uses the same equation, but have different fields since the library +is given r and cannot choose it. + ++a1_param+ struct fields: + + p, n, l: + p + 1 = n * l + p is prime, same as the q in a_param, n is the order of the group. + +=== Type B internals === + +Unimplemented. Similar to type A. The curve y^2^ = x^3^ + 1 over the field F_q +for some prime q = 2 mod 3, which implies cube roots in F_q are easy to +compute, though we can achieve this for type A pairings by constraining q +appropriately. I recommend requiring q = 3 mod 4 as well, so that -1 is +a quadratic nonresidue. + +The lack of an x term simplifies some routines such as point doubling. + +It turns out we must choose between symmetry or efficiency due to the nature of +a certain optimization. + +=== Type C internals === + +Unimplemented. The supersingular curves y^2^ = x^3^ + 2x + 1 and +y^2^ = x^3^ + 2x - 1 over a field of characteristic 3. Discussed at length +by Boneh, Lynn, and Shacham, "Short signatures from the Weil pairing". +Many optimizations can be applied to speed up these pairings; see +Barreto et al., "Efficient algorithms for pairing-based cryptosystems", but +sadly, an attack due to Coppersmith makes these curves less attractive. + +=== Type D internals === + +These are ordinary curves of with embedding degree 6, whose orders are prime +or a prime multiplied by a small constant. + +A type D curve is defined over some field F_q and has order h * r where +r is a prime and h is a small constant. Over the field F_q^6^ its order is +a multiple of r^2^. + +Typically the order of the curve E is around 170 bits, as is F_q, the base +field, thus q^k^ is around the 1024-bit mark which is commonly considered +good enough. + ++d_param+ struct fields: + + q F_q is the base field + n # of points in E(F_q) + r large prime dividing n + h n = h * r + a E: y^2 = x^3 + ax + b + b + nk # of points in E(F_q^k) + hk nk = hk * r * r + coeff0 coefficients of a monic cubic irreducible over F_q + coeff1 + coeff2 + nqr quadratic nonresidue in F_q + +These were discovered by Miyaji, Nakabayashi and Takano, +"New explicit conditions of elliptic curve traces for FR-reduction". + +=== Type E Internals === + +The CM (Complex Multiplication) method of constructing elliptic curves +starts with the Diophantine equation + + DV^2 = 4q - t^2 + +If t = 2 and q = D r^2^ h^2^ + 1 for some prime r (which we choose to +be a Solinas prime) and some integer h, we find that this equation is easily +solved with V = 2rh. + +Thus it is easy to find a curve (over the field F_q) with order q - 1. +Note r^2^ divides q - 1, thus we have an embedding degree of 1. + +Hence all computations necessary for the pairing can be done in F_q alone. +There is never any need to extend F_q. + +As q is typically 1024 bits, group elements take a lot of space to represent. +Moreover, many optimizations do not apply to this type, resulting in a slower +pairing. + ++e_param+ struct fields: + + exp2, exp1, sign1, sign0, r: + r = 2^exp2 + sign1 * 2^exp1 + sign0 * 1 (Solinas prime) + q, h + q = h r^2 + 1 where r is prime, and h is 28 times a perfect square + a, b + E: y^2 = x^3 + ax + b + +=== Type F internals === + +Using carefully crafted polynomials, k = 12 pairings can be constructed. +Only 160 bits are needed to represent elements of one group, and 320 bits +for the other. + +Also, embedding degree k = 12 allows higher security short signatures. +(k = 6 curves cannot +be used to scale security from 160-bits to say 256-bits because finite +field attacks are subexponential.) + ++f_param+ struct fields: + + q: + The curve is defined over Fq + r: + The order of the curve. + b: + E: y^2= x^3 + b + beta: + A quadratic nonresidue in Fq: used in quadratic extension. + alpha0, alpha1: + x^6 + alpha0 + alpha1 sqrt(beta) is irreducible: used in sextic extension. + +Discovered by Barreto and Naehrig, "Pairing-friendly elliptic curves of prime order". + +=== Type G Internals === + +Another construction based on the CM method. + ++g_param+ struct fields: + + q, n, h, r: + h * r = n is the order of E(F_q) + a, b: + E: y^2 = x^3 + ax + b + nk: + #E(F_q^10) + hk: + hk * r^2 = nk + coeff: + array of coefficients of polynomial used for quintic extension. + nqr: + a quadratic nonresidue + ++g_param+ struct fields: + +Discovered by Freeman, "Constructing pairing-friendly elliptic curves with embedding degree 10." + +=== Type I Internals === + +Type I pairings is symmetric, constructed on a supersingular curve +y^2^ = x^3^ - x + 1 over a ternary extension field F_{3^m^}. +The embedding degree k is 6. +Both G1 and G2 are the group of points E(F_{3^m^}). +GT is a subgroup of F_{3^6*m^}. The group order is a prime number. + +parameters: + + m, t: + The ternary extension field is F(3)[x]/(x^m^ + x^t^ + 2). + n: + the order of G1 + n2: + n * n2 = number of points in E(F_{3^m^}) + +Introduced by Barreto et al, "Efficient Pairing Computation on Supersingular +Abelian Varieties", Designs, Codes and Cryptography, vol. 42, no. 3, pp. 239-271, +Mar. 2007. + +=== Testing functions === + +For testing, debugging, demonstrations and benchmarks. +Declared in +pbc_test.h+: + +include::gen/test.txt[] + +=== Dynamic arrays === + +The +darray_t+ data type manages an array of pointers of type +void \*+, +allocating more memory when necessary. +Declared in +pbc_darray.h+. + +include::gen/darray.txt[] + +=== Symbol tables === + +The +symtab_t+ data type manages symbol tables where the keys are strings of +type +char \*+ and the values are pointers of type +void \*+. + +At present, they are implemented inefficiently using dynamic arrays, but this +will change if the need arises. They are only used when reading a +pbc_param_t+ +from a string. Declared in +pbc_symtab.h+. + +include::gen/symtab.txt[] + +=== Religious stances === + +I chose C because: + +- GMP, which PBC requires and is also modeled on, is also written in C. +- PBC is intended to be a low-level portable cryptographic library. C is the + least common denominator. It should not be difficult to wrap PBC for other + languages. +- Despite its drawbacks (I would appreciate operator overloading and + genericity, and to a lesser extent garbage collection), I've found few + languages I like better. To quote Rob Pike, C is the desert island language. + (I also agree with his statement that OO languages conceptually provide + little extra over judicious use of function pointers in C.) + +With respect to indentation, I'm migrating the code to follow +http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml[Google C++ +Style Guide] to avoid having to switch styles all the time. +The code was originally written using my old style: 4-space indent with 1TBS +(One True Brace Style). + +I'd like to have no library dependencies (except standard C libraries), +but then I'd have to write a large integer library. Furthermore, I'd have to +write it in assembly, and then port it. + +To avoid this, I use an existing library. I selected GMP because the library's +focus is on multiprecision arithmetic and nothing else, and it aims to be as +fast as possible on many platforms. Another important factor is that GMP is +released under a free license. + +On the other hand, GMP is written to deal with extremely large numbers, while I +mostly only need integers that are roughly between 160 and 2048 bits. It is +possible a library specializing in numbers of these sizes would be better for +PBC. + +I'm fond of GMP's method for eliminating the need for the +&+ and +*+ +operators most of the time by declaring a typedef on arrays of size 1. I try +to do the same with PBC for consistency, though this trick does have drawbacks. + +I would like to have GMP as the only library dependency, though I do not mind +using other libraries so long as they are optional. For example, one of the +test programs is much easier to use if compiled with the GNU readline library, +but by default compiles without it and is still functional. + +I dislike the C preprocessor. I like to place platform-specific code in +separate files and let the build system work out which one to use. Integer +constants can be defined with enum instead. I intend to minimize the number of ++#include+ statements in header files for PBC's internal use as much as +possible (they should be in the `.c` files instead), and later perhaps even +remove those annoying +#ifndef+ statements too. +I grudgingly accept some macros for PBC's debugging features. + +I liberally use nested functions, a GNU C extension. I find their expressiveness so indispensable that I'm willing to sacrifice portability for them. + +The +http://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html[GNU libc manual] +states that data types ending in +_t+ should not be used because they are +reserved for future additions to C or POSIX. On the other hand, I want to stay +consistent with GMP, and ending data types with +_t+ is common practice. |