summaryrefslogtreecommitdiffstats
path: root/moon-abe/pbc-0.5.14/doc/internal.txt
diff options
context:
space:
mode:
authorRuan HE <ruan.he@orange.com>2015-09-04 07:35:06 +0000
committerGerrit Code Review <gerrit@172.30.200.206>2015-09-04 07:35:06 +0000
commitca6aa8198d2335f8c326c3dd4d26bf5899064214 (patch)
tree6274a2d971fc0cac0896efe8583927d0190e3d20 /moon-abe/pbc-0.5.14/doc/internal.txt
parent92fd2dbfb672d7b2b1cdfd5dd5cf89f7716b3e12 (diff)
parent3baeb11a8fbcfcdbc31976d421f17b85503b3ecd (diff)
Merge "init attribute-based encryption"
Diffstat (limited to 'moon-abe/pbc-0.5.14/doc/internal.txt')
-rw-r--r--moon-abe/pbc-0.5.14/doc/internal.txt428
1 files changed, 428 insertions, 0 deletions
diff --git a/moon-abe/pbc-0.5.14/doc/internal.txt b/moon-abe/pbc-0.5.14/doc/internal.txt
new file mode 100644
index 00000000..b2f217e3
--- /dev/null
+++ b/moon-abe/pbc-0.5.14/doc/internal.txt
@@ -0,0 +1,428 @@
+== PBC internals ==
+
+The source code is organized by subdirectories:
+
+*`include`*: Headers describing the official API. Headers in other places
+are for internal use only.
+
+*`arith`*: Finite fields: modular arithmetic, polynomial rings, and polynomial
+rings modulo a polynomial. Finite fields of low characteristic are unsupported.
+
+*`ecc`*: Elliptic curve generation, elliptic curve groups and pairings. One
+source file is dedicated to each type of pairing, containing specialized
+optimizations. Some of the code requires arbitrary precision complex numbers,
+which also live here but should be moved elsewhere one day.
+
+*`misc`*: Dynamic arrays, symbol tables, benchmarking, logging, debugging,
+other utilities.
+
+*`gen`*: Programs that generate pairing parameters and list Hilbert
+polynomials. These were used to prepare the samples in the `param` directory.
+
+*`example`*: Example programs showing how to use the library.
+
+*`guru`*: Tests, experimental code.
+
+=== Groups, rings, fields ===
+
+Algebraic structures are represented in the +field_t+ data type, which mostly
+contains pointers to functions written to perform operations such as addition
+and multiplication in that particular group, ring or field:
+
+ struct field_s {
+ ...
+ void (*init)(element_ptr);
+ void (*clear)(element_ptr);
+ ...
+ void (*add)(element_ptr, element_ptr, element_ptr);
+ void (*sub)(element_ptr, element_ptr, element_ptr);
+ void (*mul)(element_ptr, element_ptr, element_ptr);
+ ...
+ };
+ typedef struct field_s *field_ptr;
+ typedef struct field_s field_t[1];
+
+The name +algebraic_structure_t+ is arguably more accurate, but far too
+cumbersome. It may help if one views groups and rings as handicapped fields.
+
+The last two lines of the above code excerpt show how GMP and PBC define data
+types: they are arrays of length one so that when a variable is
+declared, space is automatically allocated for it on the stack.
+Yet when used as a argument to a function, a pointer is passed, thus there is
+no need to explicitly allocate and deallocate memory, nor reference and
+dereference variables.
+
+Each +element_t+ contains a field named +field+ to such a +field_t+ variable.
+The only other field is +data+, which stores any data needed for the
+implementation of the particular algebraic structure the element resides in.
+
+ struct element_s {
+ struct field_s *field;
+ void *data;
+ };
+
+When an +element_t+ variable is initialized, +field+ is set appropriately, and
+then the initialization specific to that field is called to complete the
+initialization. Here, a line of code is worth a thousand words:
+
+ void element_init(element_t e, field_ptr f) {
+ e->field = f;
+ f->init(e);
+ }
+
+Thus during a call to one of the `element_` functions, the +field+ pointer is
+followed then the appropriate routine is executed. For example, modular addition
+results when the input element is an element of a finite field, while
+polynomial addition is performed for elements of a polynomial ring and so on.
+
+ void element_add(element_t n, element_t a, element_t b) {
+ n->field->add(n, a, b);
+ }
+
+My design may seem dangerous because if a programmer inadvertently attempts
+to add a polynomial and a point on an elliptic curve, say, the code
+will compile without warnings since they have the same data type.
+
+However I settled on having a catch-all ``glorified +void *+'' +element_t+
+because I wanted to
+
+- extend a field an arbitrary number of times (though in practice, currently I
+ only need to extend a field twice at most),
+- switch fields easily, so for example a program that benchmarks addition in
+ polynomial rings can be trivially modified to benchmark addition in a group,
+ and
+- interchange different implementations of the same algebraic structure, for
+ example, compare Montgomery representation versus a naive implementation of
+ integer modulo rings.
+
+Additionally, defining `PBC_DEBUG` catches many type mismatches.
+
+In mathematics, groups, rings and fields should be distinguished, but for
+implmentation, it is simplest lump them together under the same heading.
+In any event, distinct data types may lead to a false sense of security.
+Fields of prime order with different moduli would still fall under the same
+data type, with unpleasant results if their elements are mistakenly mixed.
+
+I have vague plans to add flags to +field_t+ describing the capabilities of a
+particular +field_t+. These flags would be set during initialization, and
+would indicate for example whether one can invert every nonzero element,
+whether there are one or two operations (that is, group versus ring), whether
+the field is an integer mod ring, polynomial ring, or polynomial mod ring, and
+so on. Once in place, more runtime checks can be performed to avoid illegal
+inversion and similar problems.
+
+Another option is to introduce data types for each of the four pairing-related
+algebraic structures, namely G1, G2, GT and Zr, as these are the only ones
+needed for implementing pairing-based cryptosystems.
+
+An alternative was to simply use +void *+ instead of +element_t+ and require
+the programmer to pass the field as a parameter, e.g. +element_add(a, b, c,
+F_13)+, but I decided the added annoyance of having to type this extra variable
+every time negated any benefits, such as obviating the need for the
++field+ pointer in +struct element_s+, even if one ignores
+the more serious problem that runtime type checking is considerably harder, if
+not impossible.
+
+I suppose one could write a preprocessor to convert one type of notation
+to the other, but I would like the code to be standard C. (On the other hand,
+as Hovav Shacham suggested, it may be nice to eventually have a converter that
+takes human-friendly infix operator expressions like `a = (b + c) *
+d` and outputs the assembly-like `element_` equivalents.)
+
+=== Internal randomness ===
+
+Some algorithms require a quadratic nonresidue in a given field. These
+are computed lazily: The first time a quadratic nonresidue is requested, one is
+generated at random, using the same source of random bits as other PBC random
+functions. [Which reminds me, should I get rid of the +nqr+ field and instead
+have it as part of the +data+ field in struct field_s?]
+
+In `fieldquadratic.c`, a quadratic field extension is constructed with a square
+root of this randomly generated quadratic nonresidue in the base field. Thus
+for a nondeterminstic source of random bits, the same field may be constructed
+differently on different runs.
+
+To construct the same field the same way every time, one must record the
+quadratic nonresidue generated from one run, and call `field_set_nqr()` every
+time this particular construction of a quadratic field extension is desired.
+Another use for this function is to save time by setting the quadratic
+nonresidue to some precomputed value.
+
+Similarly, for higher degree extensions, a random irreducible polynomial
+may be chosen to construct it, but this must be recorded if the same
+construction is later required.
+
+This happens behind the scenes in PBC.
+
+=== Type A internals ===
+
+Type A pairings are constructed on the curve y^2^ = x^3^ + x over the field F_q
+for some prime q = 3 mod 4.
+Both G1 and G2 are the group of points E(F_q), so this
+pairing is symmetric. It turns out #E(F_q) = q + 1 and
+#E(F_q^2^) = (q + 1)^2^. Thus the embedding degree k is 2,
+and hence GT is a subgroup of F_q^2. The order r is some prime
+factor of q + 1.
+
+Write q + 1 = r * h. For efficiency, r is picked to be a Solinas prime,
+that is, r has the form 2^a^ +- 2^b^ +- 1 for some integers 0 < b < a.
+
+Also, we choose q = -1 mod 12 so F_q^2^ can be implemented as F_q[i]
+(where i = sqrt(-1)) and since q = -1 mod 3, cube roots in F_q
+are easy to compute. This latter feature may be removed because I have
+not found a use for it yet (in which case we only need q = -1 mod 4).
+
++a_param+ struct fields:
+
+ exp2, exp1, sign1, sign0, r:
+ r = 2^exp2 + sign1 * 2^exp1 + sign0 * 1 (Solinas prime)
+ q, h:
+ r * h = q + 1
+ q is a prime, h is a multiple of 12 (thus q = -1 mod 12)
+
+Type A1 uses the same equation, but have different fields since the library
+is given r and cannot choose it.
+
++a1_param+ struct fields:
+
+ p, n, l:
+ p + 1 = n * l
+ p is prime, same as the q in a_param, n is the order of the group.
+
+=== Type B internals ===
+
+Unimplemented. Similar to type A. The curve y^2^ = x^3^ + 1 over the field F_q
+for some prime q = 2 mod 3, which implies cube roots in F_q are easy to
+compute, though we can achieve this for type A pairings by constraining q
+appropriately. I recommend requiring q = 3 mod 4 as well, so that -1 is
+a quadratic nonresidue.
+
+The lack of an x term simplifies some routines such as point doubling.
+
+It turns out we must choose between symmetry or efficiency due to the nature of
+a certain optimization.
+
+=== Type C internals ===
+
+Unimplemented. The supersingular curves y^2^ = x^3^ + 2x + 1 and
+y^2^ = x^3^ + 2x - 1 over a field of characteristic 3. Discussed at length
+by Boneh, Lynn, and Shacham, "Short signatures from the Weil pairing".
+Many optimizations can be applied to speed up these pairings; see
+Barreto et al., "Efficient algorithms for pairing-based cryptosystems", but
+sadly, an attack due to Coppersmith makes these curves less attractive.
+
+=== Type D internals ===
+
+These are ordinary curves of with embedding degree 6, whose orders are prime
+or a prime multiplied by a small constant.
+
+A type D curve is defined over some field F_q and has order h * r where
+r is a prime and h is a small constant. Over the field F_q^6^ its order is
+a multiple of r^2^.
+
+Typically the order of the curve E is around 170 bits, as is F_q, the base
+field, thus q^k^ is around the 1024-bit mark which is commonly considered
+good enough.
+
++d_param+ struct fields:
+
+ q F_q is the base field
+ n # of points in E(F_q)
+ r large prime dividing n
+ h n = h * r
+ a E: y^2 = x^3 + ax + b
+ b
+ nk # of points in E(F_q^k)
+ hk nk = hk * r * r
+ coeff0 coefficients of a monic cubic irreducible over F_q
+ coeff1
+ coeff2
+ nqr quadratic nonresidue in F_q
+
+These were discovered by Miyaji, Nakabayashi and Takano,
+"New explicit conditions of elliptic curve traces for FR-reduction".
+
+=== Type E Internals ===
+
+The CM (Complex Multiplication) method of constructing elliptic curves
+starts with the Diophantine equation
+
+ DV^2 = 4q - t^2
+
+If t = 2 and q = D r^2^ h^2^ + 1 for some prime r (which we choose to
+be a Solinas prime) and some integer h, we find that this equation is easily
+solved with V = 2rh.
+
+Thus it is easy to find a curve (over the field F_q) with order q - 1.
+Note r^2^ divides q - 1, thus we have an embedding degree of 1.
+
+Hence all computations necessary for the pairing can be done in F_q alone.
+There is never any need to extend F_q.
+
+As q is typically 1024 bits, group elements take a lot of space to represent.
+Moreover, many optimizations do not apply to this type, resulting in a slower
+pairing.
+
++e_param+ struct fields:
+
+ exp2, exp1, sign1, sign0, r:
+ r = 2^exp2 + sign1 * 2^exp1 + sign0 * 1 (Solinas prime)
+ q, h
+ q = h r^2 + 1 where r is prime, and h is 28 times a perfect square
+ a, b
+ E: y^2 = x^3 + ax + b
+
+=== Type F internals ===
+
+Using carefully crafted polynomials, k = 12 pairings can be constructed.
+Only 160 bits are needed to represent elements of one group, and 320 bits
+for the other.
+
+Also, embedding degree k = 12 allows higher security short signatures.
+(k = 6 curves cannot
+be used to scale security from 160-bits to say 256-bits because finite
+field attacks are subexponential.)
+
++f_param+ struct fields:
+
+ q:
+ The curve is defined over Fq
+ r:
+ The order of the curve.
+ b:
+ E: y^2= x^3 + b
+ beta:
+ A quadratic nonresidue in Fq: used in quadratic extension.
+ alpha0, alpha1:
+ x^6 + alpha0 + alpha1 sqrt(beta) is irreducible: used in sextic extension.
+
+Discovered by Barreto and Naehrig, "Pairing-friendly elliptic curves of prime order".
+
+=== Type G Internals ===
+
+Another construction based on the CM method.
+
++g_param+ struct fields:
+
+ q, n, h, r:
+ h * r = n is the order of E(F_q)
+ a, b:
+ E: y^2 = x^3 + ax + b
+ nk:
+ #E(F_q^10)
+ hk:
+ hk * r^2 = nk
+ coeff:
+ array of coefficients of polynomial used for quintic extension.
+ nqr:
+ a quadratic nonresidue
+
++g_param+ struct fields:
+
+Discovered by Freeman, "Constructing pairing-friendly elliptic curves with embedding degree 10."
+
+=== Type I Internals ===
+
+Type I pairings is symmetric, constructed on a supersingular curve
+y^2^ = x^3^ - x + 1 over a ternary extension field F_{3^m^}.
+The embedding degree k is 6.
+Both G1 and G2 are the group of points E(F_{3^m^}).
+GT is a subgroup of F_{3^6*m^}. The group order is a prime number.
+
+parameters:
+
+ m, t:
+ The ternary extension field is F(3)[x]/(x^m^ + x^t^ + 2).
+ n:
+ the order of G1
+ n2:
+ n * n2 = number of points in E(F_{3^m^})
+
+Introduced by Barreto et al, "Efficient Pairing Computation on Supersingular
+Abelian Varieties", Designs, Codes and Cryptography, vol. 42, no. 3, pp. 239-271,
+Mar. 2007.
+
+=== Testing functions ===
+
+For testing, debugging, demonstrations and benchmarks.
+Declared in +pbc_test.h+:
+
+include::gen/test.txt[]
+
+=== Dynamic arrays ===
+
+The +darray_t+ data type manages an array of pointers of type +void \*+,
+allocating more memory when necessary.
+Declared in +pbc_darray.h+.
+
+include::gen/darray.txt[]
+
+=== Symbol tables ===
+
+The +symtab_t+ data type manages symbol tables where the keys are strings of
+type +char \*+ and the values are pointers of type +void \*+.
+
+At present, they are implemented inefficiently using dynamic arrays, but this
+will change if the need arises. They are only used when reading a +pbc_param_t+
+from a string. Declared in +pbc_symtab.h+.
+
+include::gen/symtab.txt[]
+
+=== Religious stances ===
+
+I chose C because:
+
+- GMP, which PBC requires and is also modeled on, is also written in C.
+- PBC is intended to be a low-level portable cryptographic library. C is the
+ least common denominator. It should not be difficult to wrap PBC for other
+ languages.
+- Despite its drawbacks (I would appreciate operator overloading and
+ genericity, and to a lesser extent garbage collection), I've found few
+ languages I like better. To quote Rob Pike, C is the desert island language.
+ (I also agree with his statement that OO languages conceptually provide
+ little extra over judicious use of function pointers in C.)
+
+With respect to indentation, I'm migrating the code to follow
+http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml[Google C++
+Style Guide] to avoid having to switch styles all the time.
+The code was originally written using my old style: 4-space indent with 1TBS
+(One True Brace Style).
+
+I'd like to have no library dependencies (except standard C libraries),
+but then I'd have to write a large integer library. Furthermore, I'd have to
+write it in assembly, and then port it.
+
+To avoid this, I use an existing library. I selected GMP because the library's
+focus is on multiprecision arithmetic and nothing else, and it aims to be as
+fast as possible on many platforms. Another important factor is that GMP is
+released under a free license.
+
+On the other hand, GMP is written to deal with extremely large numbers, while I
+mostly only need integers that are roughly between 160 and 2048 bits. It is
+possible a library specializing in numbers of these sizes would be better for
+PBC.
+
+I'm fond of GMP's method for eliminating the need for the +&amp;+ and +*+
+operators most of the time by declaring a typedef on arrays of size 1. I try
+to do the same with PBC for consistency, though this trick does have drawbacks.
+
+I would like to have GMP as the only library dependency, though I do not mind
+using other libraries so long as they are optional. For example, one of the
+test programs is much easier to use if compiled with the GNU readline library,
+but by default compiles without it and is still functional.
+
+I dislike the C preprocessor. I like to place platform-specific code in
+separate files and let the build system work out which one to use. Integer
+constants can be defined with enum instead. I intend to minimize the number of
++#include+ statements in header files for PBC's internal use as much as
+possible (they should be in the `.c` files instead), and later perhaps even
+remove those annoying +#ifndef+ statements too.
+I grudgingly accept some macros for PBC's debugging features.
+
+I liberally use nested functions, a GNU C extension. I find their expressiveness so indispensable that I'm willing to sacrifice portability for them.
+
+The
+http://www.gnu.org/software/libc/manual/html_node/Reserved-Names.html[GNU libc manual]
+states that data types ending in +_t+ should not be used because they are
+reserved for future additions to C or POSIX. On the other hand, I want to stay
+consistent with GMP, and ending data types with +_t+ is common practice.