13

Die Aufgabe besteht einfach darin, herauszufinden, wie viel schneller Sie n choose n / 2 (für gerade n) berechnen können als die in Python integrierte Funktion. Natürlich ist dies für große n eine ziemlich große Zahl. Anstatt die ganze Zahl auszugeben, sollten Sie die Summe der Ziffern ausgeben. Zum Beispiel n = 100000lautet die Antwort 135702. Für n=1000000es ist 1354815.

Hier ist der Python-Code:

from scipy.misc import comb
def sum_digits(n):
   r = 0
   while n:
       r, n = r + n % 10, n / 10
   return r
sum_digits(comb(n,n/2,exact=True))

Ihre Punktzahl ist (highest n on your machine using your code)/(highest n on your machine using my code). Ihr Code muss in höchstens 60 Sekunden enden.

Ihr Programm muss für alle gerade n die richtige Ausgabe liefern: 2 <= n <= (Ihr höchstes n)

Sie können keinen eingebauten Code oder keine eingebauten Bibliotheken verwenden, die Binomialkoeffizienten oder Werte berechnen, die schnell in Binomialkoeffizienten umgewandelt werden können.

Sie können jede Sprache Ihrer Wahl verwenden.

Führende Antwort Die aktuelle führende Antwort mit erstaunlichen 680.09 ist nur halb so hoch.

fastest-code number-theory
quelle

2

Sollen wir Lösungen in Python oder in einer Sprache Ihrer Wahl einreichen?

Es ist möglich, eine Routine zu schreiben, die dies auf einem modernen Computer ausführt nund in die Millionen geht, während ich bezweifle, dass die Python-Funktion etwas Größeres bewältigen würde als n = 1e5ohne Würgen.

COTO

@Alessandro Sie können jede Sprache Ihrer Wahl verwenden. Die einzige Einschränkung war, dass Sie keine eingebauten Funktionen verwenden können, um die Koeffizienten zu berechnen.

2

Sind Fakultätsfunktionen erlaubt? Ich nahm nicht an, dass sie "schnell in Binomialkoeffizienten umgewandelt werden können" (das Ganze ist nur eine Fakultät geteilt durch eine andere Fakultät im Quadrat), aber da eine Antwort jetzt eine verwendet, wäre Klarheit schön.

Geobits

1

@Comintern: Ich habe diesen Bezugspunkt mit 287 mil in 1 min oder 169 mil in 35 s erfolgreich repliziert! :)

Hälfte

9

C ++ (GMP) - (287.000.000 / 422.000) = 680,09

Kombinieren Sie schamlos Kummers Theorem von xnor und GMP von qwr. ~~Immer noch nicht einmal in der Nähe der Go-Lösung, weiß nicht warum.~~

Bearbeiten: Danke Keith Randall für die Erinnerung, dass die Multiplikation schneller ist, wenn die Zahl in der Größe ähnlich ist. Ich habe Multi-Level-Multiplikation implementiert, ähnlich dem Memory-Coalescing-Konzept für die Speicherverwaltung. Das Ergebnis kann sich sehen lassen. Was früher 51 Sekunden dauerte, dauert jetzt nur noch 0,5 Sekunden (dh 100-fache Verbesserung !!)

ALTER CODE (n = 14.000.000)
Fertig gesiebt in 0.343s
Binom in 51.929s berechnet
Fertig mit der Summierung in 0.901s
14000000: 18954729

echte 0m53.194s
Benutzer 0m53.116s
sys 0m0.060s

NEUER CODE (n = 14.000.000)
Fertig gesiebt in 0.343s
Berechnung des Binoms in 0.552s abgeschlossen
Fertig mit der Summierung in 0.902s
14000000: 18954729

echte 0m1.804s
Benutzer 0m1.776s
sys 0m0.023s

Der Lauf nach n=287,000,000

Fertig gesiebt in 4.211s
Binom in 17.934s berechnet
Fertig Summierung in 37.677s
287000000: 388788354

echte 0m59.928s
Benutzer 0m58.759s
sys 0m1.116s

Der Code. Kompilieren mit-lgmp -lgmpxx -O3

#include <gmpxx.h>
#include <iostream>
#include <time.h>
#include <cstdio>

const int MAX=287000000;
const int PRIME_COUNT=15700000;

int primes[PRIME_COUNT], factors[PRIME_COUNT], count;
bool sieve[MAX];
int max_idx=0;

void run_sieve(){
    sieve[2] = true;
    primes[0] = 2;
    count = 1;
    for(int i=3; i<MAX; i+=2){
        sieve[i] = true;
    }
    for(int i=3; i<17000; i+=2){
        if(!sieve[i]) continue;
        for(int j = i*i; j<MAX; j+=i){
            sieve[j] = false;
        }
    }
    for(int i=3; i<MAX; i+=2){
        if(sieve[i]) primes[count++] = i;
    }
}

mpz_class sum_digits(mpz_class n){
    clock_t t = clock();
    char* str = mpz_get_str(NULL, 10, n.get_mpz_t());
    int result = 0;
    for(int i=0;str[i]>0;i++){
        result+=str[i]-48;
    }
    printf("Done summing in %.3fs\n", ((float)(clock()-t))/CLOCKS_PER_SEC);
    return result;
}

mpz_class nc2_fast(const mpz_class &x){
    clock_t t = clock();
    int prime;
    const unsigned int n = mpz_get_ui(x.get_mpz_t());
    const unsigned int n2 = n/2;
    unsigned int m;
    unsigned int digit;
    unsigned int carry=0;
    unsigned int carries=0;
    mpz_class result = 1;
    mpz_class prime_prods = 1;
    mpz_class tmp;
    mpz_class tmp_prods[32], tmp_prime_prods[32];
    for(int i=0; i<32; i++){
        tmp_prods[i] = (mpz_class)NULL;
        tmp_prime_prods[i] = (mpz_class)NULL;
    }
    for(int i=0; i< count; i++){
        prime = primes[i];
        carry=0;
        carries=0;
        if(prime > n) break;
        if(prime > n2){
            tmp = prime;
            for(int j=0; j<32; j++){
                if(tmp_prime_prods[j] == NULL){
                    tmp_prime_prods[j] = tmp;
                    break;
                } else {
                    mpz_mul(tmp.get_mpz_t(), tmp.get_mpz_t(), tmp_prime_prods[j].get_mpz_t());
                    tmp_prime_prods[j] = (mpz_class)NULL;
                }
            }
            continue;
        }
        m=n2;
        while(m>0){
            digit = m%prime;
            carry = (2*digit + carry >= prime) ? 1 : 0;
            carries += carry;
            m/=prime;
        }
        if(carries>0){
            tmp = 0;
            mpz_ui_pow_ui(tmp.get_mpz_t(), prime, carries);
            for(int j=0; j<32; j++){
                if(tmp_prods[j] == NULL){
                    tmp_prods[j] = tmp;
                    break;
                } else {
                    mpz_mul(tmp.get_mpz_t(), tmp.get_mpz_t(), tmp_prods[j].get_mpz_t());
                    tmp_prods[j] = (mpz_class)NULL;
                }
            }
        }
    }
    result = 1;
    prime_prods = 1;
    for(int j=0; j<32; j++){
        if(tmp_prods[j] != NULL){
            mpz_mul(result.get_mpz_t(), result.get_mpz_t(), tmp_prods[j].get_mpz_t());
        }
        if(tmp_prime_prods[j] != NULL){
            mpz_mul(prime_prods.get_mpz_t(), prime_prods.get_mpz_t(), tmp_prime_prods[j].get_mpz_t());
        }
    }
    mpz_mul(result.get_mpz_t(), result.get_mpz_t(), prime_prods.get_mpz_t());
    printf("Done calculating binom in %.3fs\n", ((float)(clock()-t))/CLOCKS_PER_SEC);
    return result;
}

int main(int argc, char* argv[]){
    const mpz_class n = atoi(argv[1]);
    clock_t t = clock();
    run_sieve();
    printf("Done sieving in %.3fs\n", ((float)(clock()-t))/CLOCKS_PER_SEC);
    std::cout << n << ": " << sum_digits(nc2_fast(n)) << std::endl;
    return 0;
}

nur zur Hälfte
quelle

2

Multiplikationen sind effizienter, wenn beide Operanden ungefähr gleich groß sind. Sie multiplizieren immer große und kleine Zahlen. Wenn Sie die kleinen Zahlen mehrmals paarweise kombinieren, ist dies möglicherweise schneller (benötigt jedoch mehr Speicherplatz).

Keith Randall

Wow, das macht einen großen Unterschied. Es ist exponentiell schneller. Ich kann jetzt in 35 Sekunden 169mil erreichen.

Nur die Hälfte des

Wow in der Tat! Was ist die zeitliche Aufteilung für die verschiedenen Teile Ihres Codes?

Ich habe das bereits in meiner Antwort dargelegt. 4s bei der Erzeugung von Primzahlen bis zu n18s bei der Berechnung des zentralen Binomialkoeffizienten und die restlichen 37s bei der Umwandlung des Ergebnisses in eine Zeichenfolge und der Summierung der Ziffer.

Nur die Hälfte des

1

Ich bin der Meinung, dass diese Antwort zu allen Open-Source-Bibliotheken beitragen sollte, die Binomialkoeffizienten berechnen. Ich kann nicht glauben, dass jemand so schnell Code hat!

7

Los, 33,96 = (16300000/480000)

package main

import "math/big"

const n = 16300000

var (
    sieve     [n + 1]bool
    remaining [n + 1]int
    count     [n + 1]int
)

func main() {
    println("finding primes")
    for p := 2; p <= n; p++ {
        if sieve[p] {
            continue
        }
        for i := p * p; i <= n; i += p {
            sieve[i] = true
        }
    }

    // count net number of times each prime appears in the result.
    println("counting factors")
    for i := 2; i <= n; i++ {
        remaining[i] = i
    }
    for p := 2; p <= n; p++ {
        if sieve[p] {
            continue
        }

        for i := p; i <= n; i += p {
            for remaining[i]%p == 0 { // may have multiple factors of p
                remaining[i] /= p

                // count positive for n!
                count[p]++
                // count negative twice for ((n/2)!)^2
                if i <= n/2 {
                    count[p] -= 2
                }
            }
        }
    }

    // ignore all the trailing zeros
    count[2] -= count[5]
    count[5] = 0

    println("listing factors")
    var m []uint64
    for i := 0; i <= n; i++ {
        for count[i] > 0 {
            m = append(m, uint64(i))
            count[i]--
        }
    }

    println("grouping factors")
    m = group(m)

    println("multiplying")
    x := mul(m)

    println("converting to base 10")
    d := 0
    for _, c := range x.String() {
        d += int(c - '0')
    }
    println("sum of digits:", d)
}

// Return product of elements in a.
func mul(a []uint64) *big.Int {
    if len(a) == 1 {
        x := big.NewInt(0)
        x.SetUint64(a[0])
        return x
    }
    m := len(a) / 2
    x := mul(a[:m])
    y := mul(a[m:])
    x.Mul(x, y) // fast because x and y are about the same length
    return x
}

// return a slice whose members have the same product
// as the input slice, but hopefully shorter.
func group(a []uint64) []uint64 {
    var g []uint64
    r := uint64(1)
    b := 1
    for _, x := range a {
        c := bits(x)
        if b+c <= 64 {
            r *= x
            b += c
        } else {
            g = append(g, r)
            r = x
            b = c
        }
    }
    g = append(g, r)
    return g
}

// bits returns the number of bits in the representation of x
func bits(x uint64) int {
    n := 0
    for x != 0 {
        n++
        x >>= 1
    }
    return n
}

Zählt alle Primfaktoren des Zählers und Nenners und hebt Übereinstimmungsfaktoren auf. Multipliziert die Reste, um das Ergebnis zu erhalten.

Mehr als 80% der Zeit wird für die Umstellung auf Basis 10 aufgewendet. Es muss einen besseren Weg geben, dies zu tun ...

Keith Randall
quelle

Bei Problemen, bei denen große Zahlen in der Basis 10 gedruckt werden müssen, ist es normalerweise hilfreich, eine eigene BigInteger-Klasse zu schreiben, die Zahlen in der Basis 1E9 ~ 2 ^ 30 speichert.

Peter Taylor

Sie gewinnen derzeit um eine Landmeile ... wie man so schön sagt.

@ PeterTaylor: Ich habe das versucht, aber es erfordert viel% 1e9 im Multiplikationscode, was die Multiplikation langsam macht.

Keith Randall

6

Python 3 (8,8 = 2,2 Millionen / 0,25 Millionen)

Dies ist in Python, das nicht für Geschwindigkeit bekannt ist, also können Sie dies wahrscheinlich besser in eine andere Sprache portieren.

Primes-Generator aus diesem StackOverflow-Wettbewerb .

import numpy
import time

def primesfrom2to(n):
    """ Input n>=6, Returns a array of primes, 2 <= p < n """
    sieve = numpy.ones(n//3 + (n%6==2), dtype=numpy.bool)
    for i in range(1,int(n**0.5)//3+1):
        if sieve[i]:
            k=3*i+1|1
            sieve[       k*k/3     ::2*k] = False
            sieve[k*(k-2*(i&1)+4)/3::2*k] = False
    return numpy.r_[2,3,((3*numpy.nonzero(sieve)[0][1:]+1)|1)]

t0 = time.clock()

N=220*10**4
n=N//2

print("N = %d" % N)
print()

print("Generating primes.")
primes = primesfrom2to(N)

t1 = time.clock()
print ("Time taken: %f" % (t1-t0))

print("Computing product.")
product = 1

for p in primes:
    p=int(p)
    carries = 0 
    carry = 0

    if p>n:
        product*=p
        continue

    m=n

    #Count carries of n+n in base p as per Kummer's Theorem
    while m:
        digit = m%p
        carry = (2*digit + carry >= p)
        carries += carry
        m//=p

    if carries >0:
        for _ in range(carries):
            product *= p

    #print(p,carries,product)

t2 = time.clock()
print ("Time taken: %f" % (t2-t1))

print("Converting number to string.")

# digit_sum = 0
# result=product

# while result:
    # digit_sum+=result%10
    # result//=10

digit_sum = 0
digit_string = str(product)

t3 = time.clock()
print ("Time taken: %f" % (t3-t2))

print("Summing digits.")
for d in str(digit_string):digit_sum+=int(d)

t4 = time.clock()
print ("Time taken: %f" % (t4-t3))
print ()

print ("Total time: %f" % (t4-t0))
print()
print("Sum of digits = %d" % digit_sum)

Die Hauptidee des Algorithmus ist die Verwendung des Kummer-Theorems , um die Primfaktorisierung des Binomials zu erhalten. Für jede Primzahl lernen wir die höchste Potenz, die die Antwort teilt, und multiplizieren das laufende Produkt mit dieser Potenz der Primzahl. Auf diese Weise müssen wir in der Primfaktorisierung der Antwort für jede Primzahl nur einmal multiplizieren.

Ausgabe mit zeitlicher Aufteilung:

N = 2200000
Generating primes.
Time taken: 0.046408
Computing product.
Time taken: 17.931472
Converting number to string.
Time taken: 39.083390
Summing digits.
Time taken: 1.502393

Total time: 58.563664

Sum of digits = 2980107

Überraschenderweise wird die meiste Zeit damit verbracht, die Zahl in eine Zeichenfolge umzuwandeln, um ihre Ziffern zu summieren. Auch überraschend, Konvertierung in einen String war viel schneller als Ziffern wiederholt immer %10und //10, obwohl die gesamte Zeichenfolge muss vermutlich im Speicher gehalten werden.

Das Generieren der Primzahlen nimmt vernachlässigbare Zeit in Anspruch (und daher fühle ich mich nicht unfair, wenn ich vorhandenen Code kopiere). Das Summieren von Ziffern ist schnell. Die tatsächliche Multiplikation dauert ein Drittel der Zeit.

Angesichts der Tatsache, dass die Ziffernsummierung der einschränkende Faktor zu sein scheint, würde ein Algorithmus zum Multiplizieren von Zahlen in Dezimaldarstellung insgesamt Zeit sparen, indem die Binär / Dezimal-Konvertierung verkürzt wird.

xnor
quelle

Dies ist sehr beeindruckend und Sie wundern sich, warum cpython Ihre Implementierung nicht verwendet!

3

Java (Punktzahl 22500/365000 = 0,062)

Ich habe kein Python auf diesem Rechner, also wäre ich dankbar, wenn jemand dies bewerten könnte. Wenn nicht, muss es warten.

(\binom{2 n}{n}) = \sum_{k = 0}^{n} {(\binom{n}{k})}^{2}

$\binom{2n}{n} = \sum_{k=0}^n \binom{n}{k}^2$

Der Engpass ist die Addition zur Berechnung des relevanten Abschnitts des Pascalschen Dreiecks (90% der Laufzeit), daher würde die Verwendung eines besseren Multiplikationsalgorithmus nicht wirklich helfen.

Beachten Sie, dass die Frage das nist, was ich anrufe 2n. Das Befehlszeilenargument ist das, was die Frage aufruft n.

public class CodeGolf37270 {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("Usage: java CodeGolf37270 <n>");
            System.exit(1);
        }

        int two_n = Integer.parseInt(args[0]);
        // \binom{2n}{n} = \sum_{k=0}^n \binom{n}{k}^2
        // Two cases:
        //   n = 2m: \binom{4m}{2m} = \binom{2m}{m}^2 + 2\sum_{k=0}^{m-1} \binom{2m}{k}^2
        //   n = 2m+1: \binom{4m+2}{2m+1} = 2\sum_{k=0}^{m} \binom{2m+1}{k}^2
        int n = two_n / 2;
        BigInt[] nCk = new BigInt[n/2 + 1];
        nCk[0] = new BigInt(1);
        for (int k = 1; k < nCk.length; k++) nCk[k] = nCk[0];
        for (int row = 2; row <= n; row++) {
            BigInt tmp = nCk[0];
            for (int col = 1; col < row && col < nCk.length; col++) {
                BigInt replacement = tmp.add(nCk[col]);
                tmp = nCk[col];
                nCk[col] = replacement;
            }
        }

        BigInt central = nCk[0]; // 1^2 = 1
        int lim = (n & 1) == 1 ? nCk.length : (nCk.length - 1);
        for (int k = 1; k < lim; k++) central = central.add(nCk[k].sq());
        central = central.add(central);
        if ((n & 1) == 0) central = central.add(nCk[nCk.length - 1].sq());

        System.out.println(central.digsum());
    }

    private static class BigInt {
        static final int B = 1000000000;
        private int[] val;

        public BigInt(int x) {
            val = new int[] { x };
        }

        private BigInt(int[] val) {
            this.val = val;
        }

        public BigInt add(BigInt that) {
            int[] left, right;
            if (val.length < that.val.length) {
                left = that.val;
                right = val;
            }
            else {
                left = val;
                right = that.val;
            }

            int[] sum = left.clone();
            int carry = 0, k = 0;
            for (; k < right.length; k++) {
                int a = sum[k] + right[k] + carry;
                sum[k] = a % B;
                carry = a / B;
            }
            while (carry > 0 && k < sum.length) {
                int a = sum[k] + carry;
                sum[k] = a % B;
                carry = a / B;
                k++;
            }
            if (carry > 0) {
                int[] wider = new int[sum.length + 1];
                System.arraycopy(sum, 0, wider, 0, sum.length);
                wider[sum.length] = carry;
                sum = wider;
            }

            return new BigInt(sum);
        }

        public BigInt sq() {
            int[] rv = new int[2 * val.length];
            // Naive multiplication
            for (int i = 0; i < val.length; i++) {
                for (int j = i; j < val.length; j++) {
                    int k = i+j;
                    long c = val[i] * (long)val[j];
                    if (j > i) c <<= 1;
                    while (c > 0) {
                        c += rv[k];
                        rv[k] = (int)(c % B);
                        c /= B;
                        k++;
                    }
                }
            }

            int len = rv.length;
            while (len > 1 && rv[len - 1] == 0) len--;
            if (len < rv.length) {
                int[] rv2 = new int[len];
                System.arraycopy(rv, 0, rv2, 0, len);
                rv = rv2;
            }

            return new BigInt(rv);
        }

        public long digsum() {
            long rv = 0;
            for (int i = 0; i < val.length; i++) {
                int x = val[i];
                while (x > 0) {
                    rv += x % 10;
                    x /= 10;
                }
            }
            return rv;
        }
    }
}

Peter Taylor
quelle

Ich bekomme 29.500 für Ihr Programm und 440.000 für das Referenzprogramm, das wäre also eine Punktzahl von 0,067. Dies wird mit Java 1.7 ( javac CodeGolf37270.java) kompiliert und mit Java 1.8 ( java CodeGolf37270 n) ausgeführt. Ich bin nicht sicher, ob es Optimierungsoptionen gibt, die mir nicht bekannt sind. Ich kann nicht versuchen, mit Java 1.8 zu kompilieren, da es mit meinem Java-Paket nicht installiert wird ...

Dennis

Interessanter Ansatz. Warum könnte eine iterative Berechnung Ihrer Meinung nach schneller sein als die Verwendung der einfachen Formel?

Nur die Hälfte des

@justhalf, ich hatte keine Ahnung, ob es schneller sein würde oder nicht, und ich habe nicht versucht, Komplexitätsberechnungen durchzuführen. Ich habe Identitätslisten nach zentralen Binomialkoeffizienten durchsucht, um zu versuchen, Formeln zu finden, die mit einer benutzerdefinierten großen Ganzzahlklasse, die für das Extrahieren von 10er-Basisziffern optimiert ist, einfach zu implementieren sind. Und nachdem ich herausgefunden habe, dass es nicht sehr effizient ist, kann ich es auch posten und jemand anderen davor bewahren, das Experiment zu wiederholen. (FWIW Ich arbeite an der Toom-Multiplikation, bin mir aber nicht sicher, wann ich sie testen und debuggen lassen werde.)

Peter Taylor

2

GMP - 1500000/300000 = 5,0

Obwohl diese Antwort nicht mit Sieben konkurriert, kann ein kurzer Code manchmal dennoch zu Ergebnissen führen.

#include <gmpxx.h>
#include <iostream>

mpz_class sum_digits(mpz_class n)
{
    char* str = mpz_get_str(NULL, 10, n.get_mpz_t());
    int result = 0;
    for(int i=0; str[i]>0; i++)

    result += str[i] - 48;

    return result;
}


mpz_class comb_2(const mpz_class &x)
{
    const unsigned int k = mpz_get_ui(x.get_mpz_t()) / 2;
    mpz_class result = k + 1;

    for(int i=2; i<=k; i++)
    {
        result *= k + i;
        mpz_divexact_ui(result.get_mpz_t(), result.get_mpz_t(), i);
    }

    return result;
}

int main()
{
    const mpz_class n = 1500000;
    std::cout << sum_digits(comb_2(n)) << std::endl;

    return 0;
}

qwr
quelle

2

Java, benutzerdefinierte Big Integer-Klasse: 32.9 (120000000/365000)

Die Hauptklasse ist ziemlich einfach:

import java.util.*;

public class PPCG37270 {
    public static void main(String[] args) {
        long start = System.nanoTime();

        int n = 12000000;
        if (args.length == 1) n = Integer.parseInt(args[0]);

        boolean[] sieve = new boolean[n + 1];
        int[] remaining = new int[n + 1];
        int[] count = new int[n + 1];

        for (int p = 2; p <= n; p++) {
            if (sieve[p]) continue;
            long p2 = p * (long)p;
            if (p2 > n) continue;
            for (int i = (int)p2; i <= n; i += p) sieve[i] = true;
        }

        for (int i = 2; i <= n; i++) remaining[i] = i;
        for (int p = 2; p <= n; p++) {
            if (sieve[p]) continue;
            for (int i = p; i <= n; i += p) {
                while (remaining[i] % p == 0) {
                    remaining[i] /= p;
                    count[p]++;
                    if (i <= n/2) count[p] -= 2;
                }
            }
        }

        count[2] -= count[5];
        count[5] = 0;

        List<BigInt> partialProd = new ArrayList<BigInt>();
        long accum = 1;
        for (int i = 2; i <= n; i++) {
            for (int j = count[i]; j > 0; j--) {
                long tmp = accum * i;
                if (tmp < 1000000000L) accum = tmp;
                else {
                    partialProd.add(new BigInt((int)accum));
                    accum = i;
                }
            }
        }
        partialProd.add(new BigInt((int)accum));
        System.out.println(prod(partialProd).digsum());
        System.out.println((System.nanoTime() - start) / 1000000 + "ms");
    }

    private static BigInt prod(List<BigInt> vals) {
        while (vals.size() > 1) {
            int n = vals.size();
            List<BigInt> next = new ArrayList<BigInt>();
            for (int i = 0; i < n; i += 2) {
                if (i == n - 1) next.add(vals.get(i));
                else next.add(vals.get(i).mul(vals.get(i+1)));
            }
            vals = next;
        }
        return vals.get(0);
    }
}

Es basiert auf einer großen Ganzzahlklasse, die für die Multiplikation optimiert ist und toString()bei der es sich um erhebliche Engpässe bei einer Implementierung mit handelt java.math.BigInteger.

/**
 * A big integer class which is optimised for conversion to decimal.
 * For use in simple applications where BigInteger.toString() is a bottleneck.
 */
public class BigInt {
    // The base of the representation.
    private static final int B = 1000000000;
    // The number of decimal digits per digit of the representation.
    private static final int LOG10_B = 9;

    public static final BigInt ZERO = new BigInt(0);
    public static final BigInt ONE = new BigInt(1);

    // We use sign-magnitude representation.
    private final boolean negative;

    // Least significant digit is at val[off]; most significant is at val[off + len - 1]
    // Unless len == 1 we guarantee that val[off + len - 1] is non-zero.
    private final int[] val;
    private final int off;
    private final int len;

    // Toom-style multiplication parameters from
    // Zuras, D. (1994). More on squaring and multiplying large integers. IEEE Transactions on Computers, 43(8), 899-908.
    private static final int[][][] Q = new int[][][]{
        {},
        {},
        {{1, -1}},
        {{4, 2, 1}, {1, 1, 1}, {1, 2, 4}},
        {{8, 4, 2, 1}, {-8, 4, -2, 1}, {1, 1, 1, 1}, {1, -2, 4, -8}, {1, 2, 4, 8}}
    };
    private static final int[][][] R = new int[][][]{
        {},
        {},
        {{1, -1, 1}},
        {{-21, 2, -12, 1, -6}, {7, -1, 10, -1, 7}, {-6, 1, -12, 2, -21}},
        {{-180, 6, 2, -80, 1, 3, -180}, {-510, 4, 4, 0, -1, -1, 120}, {1530, -27, -7, 680, -7, -27, 1530}, {120, -1, -1, 0, 4, 4, -510}, {-180, 3, 1, -80, 2, 6, -180}}
    };
    private static final int[][] S = new int[][]{
        {},
        {},
        {1, 1, 1},
        {1, 6, 2, 6, 1},
        {1, 180, 120, 360, 120, 180, 1}
    };

    /**
     * Constructs a big version of an integer value.
     * @param x The value to represent.
     */
    public BigInt(int x) {
        this(Integer.toString(x));
    }

    /**
     * Constructs a big version of a long value.
     * @param x The value to represent.
     */
    public BigInt(long x) {
        this(Long.toString(x));
    }

    /**
     * Parses a decimal representation of an integer.
     * @param str The value to represent.
     */
    public BigInt(String str) {
        this(str.charAt(0) == '-', split(str));
    }

    /**
     * Constructs a sign-magnitude representation taking the entire span of the array as the range of interest.
     * @param neg Is the value negative?
     * @param val The base-B digits, least significant first.
     */
    private BigInt(boolean neg, int[] val) {
        this(neg, val, 0, val.length);
    }

    /**
     * Constructs a sign-magnitude representation taking a range of an array as the magnitude.
     * @param neg Is the value negative?
     * @param val The base-B digits, least significant at offset off, most significant at off + val - 1.
     * @param off The offset within the array.
     * @param len The number of base-B digits.
     */
    private BigInt(boolean neg, int[] val, int off, int len) {
        // Bounds checks
        if (val == null) throw new IllegalArgumentException("val");
        if (off < 0 || off >= val.length) throw new IllegalArgumentException("off");
        if (len < 1 || off + len > val.length) throw new IllegalArgumentException("len");

        this.negative = neg;
        this.val = val;
        this.off = off;
        // Enforce the invariant that this.len is 1 or val[off + len - 1] is non-zero.
        while (len > 1 && val[off + len - 1] == 0) len--;
        this.len = len;

        // Sanity check
        for (int i = 0; i < len; i++) {
            if (val[off + i] < 0) throw new IllegalArgumentException("val contains negative digits");
        }
    }

    /**
     * Splits a string into base-B digits.
     * @param str The string to parse.
     * @return An array which can be passed to the (boolean, int[]) constructor.
     */
    private static int[] split(String str) {
        if (str.charAt(0) == '-') str = str.substring(1);

        int[] arr = new int[(str.length() + LOG10_B - 1) / LOG10_B];
        int i, off;
        // Each element of arr represents LOG10_B characters except (probably) the last one.
        for (i = 0, off = str.length() - LOG10_B; off > 0; off -= LOG10_B) {
            arr[i++] = Integer.parseInt(str.substring(off, off + LOG10_B));
        }
        arr[i] = Integer.parseInt(str.substring(0, off + LOG10_B));
        return arr;
    }

    public boolean isZero() {
        return len == 1 && val[off] == 0;
    }

    public BigInt negate() {
        return new BigInt(!negative, val, off, len);
    }

    public BigInt add(BigInt that) {
        // If the signs differ, then since we use sign-magnitude representation we want to do a subtraction.
        boolean isSubtraction = negative ^ that.negative;

        BigInt left, right;
        if (len < that.len) {
            left = that;
            right = this;
        }
        else {
            left = this;
            right = that;

            // For addition I just care about the lengths of the arrays.
            // For subtraction I want the largest absolute value on the left.
            if (isSubtraction && len == that.len) {
                int cmp = compareAbsolute(that);
                if (cmp == 0) return ZERO; // Cheap special case
                if (cmp < 0) {
                    left = that;
                    right = this;
                }
            }
        }

        if (right.isZero()) return left;

        BigInt result;
        if (!isSubtraction) {
            int[] sum = new int[left.len + 1];
            // A copy here rather than using left.val in the main loops and copying remaining values
            // at the end gives a small performance boost, probably due to cache locality.
            System.arraycopy(left.val, left.off, sum, 0, left.len);

            int carry = 0, k = 0;
            for (; k < right.len; k++) {
                int a = sum[k] + right.val[right.off + k] + carry;
                sum[k] = a % B;
                carry = a / B;
            }
            for (; carry > 0 && k < left.len; k++) {
                int a = sum[k] + carry;
                sum[k] = a % B;
                carry = a / B;
            }
            sum[left.len] = carry;

            result = new BigInt(negative, sum);
        }
        else {
            int[] diff = new int[left.len];
            System.arraycopy(left.val, left.off, diff, 0, left.len);

            int carry = 0, k = 0;
            for (; k < right.len; k++) {
                int a = diff[k] - right.val[right.off + k] + carry;
                // Why did anyone ever think that rounding positive and negative divisions differently made sense?
                if (a < 0) {
                    diff[k] = a + B;
                    carry = -1;
                }
                else {
                    diff[k] = a % B;
                    carry = a / B;
                }
            }
            for (; carry != 0 && k < left.len; k++) {
                int a = diff[k] + carry;
                if (a < 0) {
                    diff[k] = a + B;
                    carry = -1;
                }
                else {
                    diff[k] = a % B;
                    carry = a / B;
                }
            }

            result = new BigInt(left.negative, diff, 0, k > left.len ? k : left.len);
        }

        return result;
    }

    private int compareAbsolute(BigInt that) {
        if (len > that.len) return 1;
        if (len < that.len) return -1;

        for (int i = len - 1; i >= 0; i--) {
            if (val[off + i] > that.val[that.off + i]) return 1;
            if (val[off + i] < that.val[that.off + i]) return -1;
        }

        return 0;
    }

    public BigInt mul(BigInt that) {
        if (isZero() || that.isZero()) return ZERO;

        if (len == 1) return that.mulSmall(negative ? -val[off] : val[off]);
        if (that.len == 1) return mulSmall(that.negative ? -that.val[that.off] : that.val[that.off]);

        int shorter = len < that.len ? len : that.len;
        BigInt result;
        // Cutoffs have been hand-tuned.
        if (shorter > 300) result = mulToom(3, that);
        else if (shorter > 28) result = mulToom(2, that);
        else result = mulNaive(that);

        return result;
    }

    BigInt mulSmall(int m) {
        if (m == 0) return ZERO;
        if (m == 1) return this;
        if (m == -1) return negate();

        // We want to do the magnitude calculation with a positive multiplicand.
        boolean neg = negative;
        if (m < 0) {
            neg = !neg;
            m = -m;
        }

        int[] pr = new int[len + 1];
        int carry = 0;
        for (int i = 0; i < len; i++) {
            long t = val[off + i] * (long)m + carry;
            pr[i] = (int)(t % B);
            carry = (int)(t / B);
        }
        pr[len] = carry;
        return new BigInt(neg, pr);
    }

    // NB This truncates.
    BigInt divSmall(int d) {
        if (d == 0) throw new ArithmeticException();
        if (d == 1) return this;
        if (d == -1) return negate();

        // We want to do the magnitude calculation with a positive divisor.
        boolean neg = negative;
        if (d < 0) {
            neg = !neg;
            d = -d;
        }

        int[] div = new int[len];
        int rem = 0;
        for (int i = len - 1; i >= 0; i--) {
            long t = val[off + i] + rem * (long)B;
            div[i] = (int)(t / d);
            rem = (int)(t % d);
        }

        return new BigInt(neg, div);
    }

    BigInt mulNaive(BigInt that) {
        int[] rv = new int[len + that.len];
        // Naive multiplication
        for (int i = 0; i < len; i++) {
            for (int j = 0; j < that.len; j++) {
                int k = i + j;
                long c = val[off + i] * (long)that.val[that.off + j];
                while (c > 0) {
                    c += rv[k];
                    rv[k] = (int)(c % B);
                    c /= B;
                    k++;
                }
            }
        }

        return new BigInt(this.negative ^ that.negative, rv);
    }

    private BigInt mulToom(int k, BigInt that) {
        // We split each number into k parts of m base-B digits each.
        // m = ceil(longer / k)
        int m = ((len > that.len ? len : that.len) + k - 1) / k;

        // Perform the splitting and evaluation steps of Toom-Cook.
        BigInt[] f1 = this.toomFwd(k, m);
        BigInt[] f2 = that.toomFwd(k, m);

        // Pointwise multiplication.
        for (int i = 0; i < f1.length; i++) f1[i] = f1[i].mul(f2[i]);

        // Inverse (or interpolation) and recomposition.
        return toomBk(k, m, f1, negative ^ that.negative, val[off], that.val[that.off]);
    }

    // Splits a number into k parts of m base-B digits each and does the polynomial evaluation.
    private BigInt[] toomFwd(int k, int m) {
        // Split.
        BigInt[] a = new BigInt[k];
        for (int i = 0; i < k; i++) {
            int o = i * m;
            if (o >= len) a[i] = ZERO;
            else {
                int l = m;
                if (o + l > len) l = len - o;
                // Ignore signs for now.
                a[i] = new BigInt(false, val, off + o, l);
            }
        }

        // Evaluate
        return transform(Q[k], a);
    }

    private BigInt toomBk(int k, int m, BigInt[] f, boolean neg, int lsd1, int lsd2) {
        // Inverse (or interpolation).
        BigInt[] b = transform(R[k], f);

        // Recomposition: add at suitable offsets, dividing by the normalisation factors
        BigInt prod = ZERO;
        int[] s = S[k];
        for (int i = 0; i < b.length; i++) {
            int[] shifted = new int[i * m + b[i].len];
            System.arraycopy(b[i].val, b[i].off, shifted, i * m, b[i].len);
            prod = prod.add(new BigInt(neg ^ b[i].negative, shifted).divSmall(s[i]));
        }

        // Handle the remainders.
        // In the worst case the absolute value of the sum of the remainders is s.length, so pretty small.
        // It should be easy enough to work out whether to go up or down.
        int lsd = (int)((lsd1 * (long)lsd2) % B);
        int err = lsd - prod.val[prod.off];
        if (err > B / 2) err -= B / 2;
        if (err < -B / 2) err += B / 2;
        return prod.add(new BigInt(err));
    }

    /**
     * Multiplies a matrix of small integers and a vector of big ones.
     * The matrix has a implicit leading row [1 0 ... 0] and an implicit trailing row [0 ... 0 1].
     * @param m The matrix.
     * @param v The vector.
     * @return m v
     */
    private BigInt[] transform(int[][] m, BigInt[] v) {
        BigInt[] b = new BigInt[m.length + 2];
        b[0] = v[0];
        for (int i = 0; i < m.length; i++) {
            BigInt s = ZERO;
            for (int j = 0; j < m[i].length; j++) s = s.add(v[j].mulSmall(m[i][j]));
            b[i + 1] = s;
        }
        b[b.length - 1] = v[v.length - 1];

        return b;
    }

    /**
     * Sums the digits of this integer.
     * @return The sum of the digits of this integer.
     */
    public long digsum() {
        long rv = 0;
        for (int i = 0; i < len; i++) {
            int x = val[off + i];
            while (x > 0) {
                rv += x % 10;
                x /= 10;
            }
        }
        return rv;
    }
}

Der große Engpass ist die naive Multiplikation (60%), gefolgt von der anderen Multiplikation (37%) und dem Sieben (3%). Der digsum()Anruf ist unbedeutend.

Leistung gemessen mit OpenJDK 7 (64 Bit).

Peter Taylor
quelle

Sehr schön. Vielen Dank.

1

Python 2 (PyPy): 1.134.000 / 486.000 = 2,32

#/!usr/bin/pypy
n=input(); a, b, c=1, 1, 2**((n+2)/4)
for i in range(n-1, n/2, -2): a*=i
for i in range(2, n/4+1): b*=i
print sum(map(int, str(a*c/b)))

Ergebnis: 1.537.506

Unterhaltsame Tatsache: Der Engpass in Ihrem Code besteht darin, die Ziffern zu addieren und nicht den Binomialkoeffizienten zu berechnen.

Dennis
quelle

Warum ist Python beim Hinzufügen von Ziffern so langsam? Sie und xnor sagen, dass es ist. Es hat mich neugierig gemacht, also habe ich meine getaktet. Es kam in weniger als einer Sekunde für den Summenteil (Java).

Geobits

@ Geobits Hmm, neugierig. Kann Java auch Binär-Dezimal-Konvertierungen ähnlich schnell durchführen? Es repräsentiert ganze Zahlen in binären Zahlen, richtig?

Xnor

Das ist eine gute Frage. Für Integer / Integer / Long / Long weiß ich, dass es binär ist. Ich bin nicht genau sicher, wie die interne Darstellung einer BigInteger ist. Wenn es dezimal ist, würde das definitiv erklären, warum es in Mathe langsam, aber schnell ist, in einen String zu konvertieren. Kann das morgen nachschlagen.

Geobits

@ Geobits, die interne Darstellung von BigInteger ist Basis 2.

Peter Taylor

Das habe ich immer angenommen, aber ich habe mich gewundert. Es sieht so aus, als würde es in lange Teile zerlegt und auf diese Weise konvertiert, zumindest in OpenJDK.

Geobits

1

Java (2.020.000 / 491.000) = 4,11

aktualisiert, zuvor 2.24

Java BigIntegerist nicht der schnellste Cruncher, aber besser als nichts.

Die Grundformel dafür scheint zu sein n! / ((n/2)!^2), aber das scheint ein Haufen redundanter Multiplikation zu sein.

Sie können eine erhebliche Beschleunigung erzielen, indem Sie alle Primfaktoren entfernen, die sowohl im Zähler als auch im Nenner enthalten sind. Dazu starte ich zuerst ein einfaches Hauptsieb. Dann zähle ich für jede Primzahl, zu welcher Kraft sie erhöht werden muss. Erhöhe jedes Mal, wenn ich einen Faktor im Zähler sehe, um den Nenner.

Ich behandle zwei getrennt (und zuerst), da es einfach ist, sie vor dem Factoring zu zählen / zu eliminieren.

Sobald dies erledigt ist, haben Sie die erforderliche Mindestmenge an Multiplikationen, was gut ist, weil die BigInt-Multiplikation langsam ist .

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class CentBiCo {
    public static void main(String[] args) {
        int n = 2020000;
        long time = System.currentTimeMillis();
        sieve(n);
        System.out.println(sumDigits(cbc(n)));
        System.out.println(System.currentTimeMillis()-time);
    }

    static boolean[] sieve;
    static List<Integer> primes;
    static void sieve(int n){
        primes = new ArrayList<Integer>((int)(Math.sqrt(n)));
        sieve = new boolean[n];
        sieve[2]=true;
        for(int i=3;i<sieve.length;i+=2)
            if(i%2==1)
                sieve[i] = true;
        for(int i=3;i<sieve.length;i+=2){
            if(!sieve[i])
                continue;
            for(int j=i*2;j<sieve.length;j+=i)
                sieve[j] = false;
        }
        for(int i=2;i<sieve.length;i++)
            if(sieve[i])
                primes.add(i);
    }

    static int[] factors;
    static void addFactors(int n, int flip){
        for(int k=0;primes.get(k)<=n;){
            int i = primes.get(k);
            if(n%i==0){
                factors[i] += flip;
                n /= i;
            } else {
                if(++k == primes.size())
                    break;
            }
        }
        factors[n]++;
    }

    static BigInteger cbc(int n){
        factors = new int[n+1];
        int x = n/2;
        for(int i=x%2<1?x+1:x+2;i<n;i+=2)
            addFactors(i,1);
        factors[2] = x;
        for(int i=1;i<=x/2;i++){
            int j=i;
            while(j%2<1 && factors[2] > 1){
                j=j/2;
                factors[2]--;
            }
            addFactors(j,-1);
            factors[2]--;
        }
        BigInteger cbc = BigInteger.ONE;
        for(int i=3;i<factors.length;i++){
            if(factors[i]>0)
                cbc = cbc.multiply(BigInteger.valueOf(i).pow(factors[i]));
        }
        return cbc.shiftLeft(factors[2]);
    }

    static long sumDigits(BigInteger in){
        long sum = 0;
        String str = in.toString();
        for(int i=0;i<str.length();i++)
            sum += str.charAt(i)-'0';
        return sum;
    }
}

Oh, und die Ausgabesumme für n = 2020000 ist 2735298zu Überprüfungszwecken.

Geobits
quelle

Ziffernsumme der zentralen Binomialkoeffizienten

Antworten:

C ++ (GMP) - (287.000.000 / 422.000) = 680,09

Los, 33,96 = (16300000/480000)

Python 3 (8,8 = 2,2 Millionen / 0,25 Millionen)

Java (Punktzahl 22500/365000 = 0,062)

GMP - 1500000/300000 = 5,0

Java, benutzerdefinierte Big Integer-Klasse: 32.9 (120000000/365000)

Python 2 (PyPy): 1.134.000 / 486.000 = 2,32

Ergebnis: 1.537.506

Java (2.020.000 / 491.000) = 4,11