Let $A$ be a fixed positive integer of size $n$ bits.
We are allowed to preprocess this integer in any way we find appropriate.
Given another positive integer $B$ of size $n$ bits, what is the complexity of computing the product $AB$?
Note that we already have fast general-purpose multiplication algorithms. The question here is whether the fixed operand lets us do something more clever.
Answers:
While it won't always be the most efficient algorithm, this question has a very close relationship with addition chains; any algorithm for computing $A$ quickly by an addition chain translates into an algorithm for computing $f(B) = AB$ by repeated addition (each addition, of course, being an $O(n)$ operation). Contrariwise, a quick algorithm for computing $AB$ for any $B$ leads to a quick algorithm for computing $A$, but of course this algorithm doesn't necessarily have to take the form of an addition chain; still, that seems like an excellent place to start. Have a look at http://en.wikipedia.org/wiki/Addition_chain or check out vol. 2 of The Art of Computer Programming for more details.
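To make the correspondence concrete, here is a minimal Python sketch for the constant $A = 15$, using the hand-picked addition chain $1, 2, 3, 6, 12, 15$ (not produced by any particular chain-search method); each chain step becomes one $O(n)$ addition of previously computed multiples of $B$:

```python
# Multiply B by the fixed constant A = 15 via the addition chain
# 1 -> 2 -> 3 -> 6 -> 12 -> 15: every element is the sum of two
# earlier elements, so every step is a single big-integer addition.
def multiply_by_15(b: int) -> int:
    x1 = b           # 1*B
    x2 = x1 + x1     # 2*B  = 1 + 1
    x3 = x2 + x1     # 3*B  = 2 + 1
    x6 = x3 + x3     # 6*B  = 3 + 3
    x12 = x6 + x6    # 12*B = 6 + 6
    return x12 + x3  # 15*B = 12 + 3

assert multiply_by_15(37) == 15 * 37
```

Five additions instead of the fourteen that plain repeated addition would need; a shorter chain for $A$ directly means fewer $O(n)$ additions.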
To expand upon Steven Stadnicki's idea, we can quickly construct a naive algorithm that does better than multiplication based on the Discrete Fourier Transform.
We count the number of ones in $A$. If fewer than half the bits are ones, we construct a linked list of their positions. To multiply, we simply shift $B$ left by each position in the list (thereby multiplying by the bit that position represents) and add up the results.
If more than half of the bits are ones, we do the same as above, but populate the list with the positions of the zeros instead. The idea is to subtract this sum from the sum that would be obtained by multiplying by all ones. To get the all-ones sum, we shift $B$ left by the number of bits in $A$ and subtract $B$, since $(2^{|A|} - 1)B = (B \ll |A|) - B$. Then we subtract the sum obtained from the linked list.
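A minimal Python sketch of this preprocessing/multiplication split (the function names and list representation here are my own, chosen for illustration):

```python
# Preprocessing: record the positions of the 1-bits of A, or of its
# 0-bits if those are sparser.  Multiplication then uses either
#   A*B = sum of (B << i) over the 1-bit positions, or
#   A*B = (B << |A|) - B - sum of (B << i) over the 0-bit positions.
def preprocess(a: int):
    n = a.bit_length()
    ones = [i for i in range(n) if (a >> i) & 1]
    if len(ones) <= n // 2:
        return ("ones", ones, n)
    zeros = [i for i in range(n) if not (a >> i) & 1]
    return ("zeros", zeros, n)

def multiply(pre, b: int) -> int:
    kind, positions, n = pre
    partial = sum(b << i for i in positions)
    if kind == "ones":
        return partial
    return (b << n) - b - partial   # all-ones sum minus the zero terms

pre = preprocess(0b11011101)
assert multiply(pre, 99) == 0b11011101 * 99
```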
We can call that the naive linked-list algorithm. Its running time is $O(n^2)$ in the worst case, but $O\!\left(\frac{|B||A|}{\sqrt{2\pi}}\right)$ in the average case, which is faster than DFT for small $|A|$.
To use the idea of lists optimally, we use divide and conquer. We split $A$ in half and find the sizes of the associated lists using the naive algorithm. If a half's list has more than 5 entries, we call the naive algorithm again on it, splitting further until every piece has a list of fewer than five positions. (This is because each such piece can then be reduced to at most 4 subtractions.)
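Here is a minimal sketch of just the split-and-combine step, under my own interpretation of the description (the list-size threshold logic is omitted): cutting $A$ into a high and a low half costs only one shift and one addition to recombine, and each half can then be handled by whichever of the ones/zeros tricks gives it the sparser list.

```python
# Divide and conquer on the constant: A = (hi << half) + lo, so
# A*B = (hi*B << half) + lo*B.  In the full algorithm each recursive
# call would dispatch to the naive linked-list method instead.
def multiply_split(a: int, b: int) -> int:
    n = a.bit_length()
    if n <= 4:                      # small base case; stand-in for
        return a * b                # the naive linked-list method
    half = n // 2
    lo = a & ((1 << half) - 1)      # low half of A
    hi = a >> half                  # high half of A
    return (multiply_split(hi, b) << half) + multiply_split(lo, b)

assert multiply_split(0b1101100101, 77) == 0b1101100101 * 77
```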
Even better still, we improve our divide-and-conquer algorithm. We iterate through all possible combinations of branching, greedily picking the best one. This preprocessing takes approximately the same time as the actual multiplication.
If we are allowed unlimited freedom in preprocessing, we solve the optimized divide-and-conquer problem over all branchings exactly. This takes time $O(2^{|A|})$ in the worst case, but it should be approximately optimal by addition-chain methods.
I'm working on calculating more exact values for the above algorithms.
The paper called Multiplication by a constant is sublinear (PDF) gives an algorithm using $O(n/\log n)$ shift/addition operations, where $n$ is the size of the constant.
Essentially, it works by looking for the 1-bits in the constant, shifting and adding the number to be multiplied only for those 1-bits (like binary long multiplication, where a 0 bit in the bottom number means the top number is not shifted and added, while a 1 bit means it is). However, this still costs $O(n)$ additions, because there can be $O(n)$ 1-bits in the constant.
The paper then talks about converting the representation of the constant into the double-base number system, in which the non-zero digits are sparser if the conversion is done correctly (it is a very redundant number system). They calculate just how sparse it is: the number of non-zero digits is bounded by $O(n/\log n)$, so a sublinear number of additions suffices. However, this is still $O(nm/\log n)$ actual bit operations, due to the $O(m)$ cost of each addition (where $n$ is the size of the constant and $m$ is the size of the other number).
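To illustrate the flavor of the representation (not the paper's actual conversion algorithm, which is more careful about guaranteeing sparsity), here is a greedy double-base sketch in Python: the constant is written as a sum of terms $2^a 3^b$, and each term then costs one shift plus $b$ shift-and-add steps for the factors of 3:

```python
# Greedily write c as a sum of terms 2^a * 3^b (a crude stand-in for
# the paper's conversion), then multiply using only shifts and adds.
def greedy_double_base(c: int):
    terms = []
    while c > 0:
        # pick the largest 2^a * 3^b not exceeding c
        val, a, b = max((2**a * 3**b, a, b)
                        for a in range(c.bit_length())
                        for b in range(c.bit_length())
                        if 2**a * 3**b <= c)
        terms.append((a, b))
        c -= val
    return terms

def multiply_double_base(terms, x: int) -> int:
    total = 0
    for a, b in terms:
        t = x << a              # multiply by 2^a: one shift
        for _ in range(b):      # multiply by 3: one shift + one add
            t = (t << 1) + t
        total += t
    return total

terms = greedy_double_base(221)   # e.g. 221 = 2^3*3^3 + 2^2 + 1
assert multiply_double_base(terms, 99) == 221 * 99
```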
So to answer your question: yes, there is a result similar to the matrix–vector case, in that you get a $\log n$ speedup when one operand is constant; but of course this speedup is only over naive long multiplication, and there exist multiplication algorithms far better than the $O(n^2/\log n)$ you can get with this approach.
As suggested by Matt Groff, you may be interested in looking to the practice community for inspiration (or if $n$ in your situation is within the bit width of a current CPU). Indeed, the problem of integer multiplication by a constant has been considered by many compiler writers and circuit designers, although they are usually interested in "multiplier-less multipliers" (multiplying using only shift, add, and subtract). One of the early references I am aware of is the following (I learned of it from Hacker's Delight, section 8.4):
Bernstein, R. (1986), Multiplication by integer constants. Software: Practice and Experience, 16: 641–652. doi: 10.1002/spe.4380160704
More modern work by Vincent Lefèvre can be found here (be sure to look at the follow-ups to his work), and he also notes a CMU project on efficient circuit synthesis (see the references there). The latter project even considers simultaneous multiplication by a set of constants.
P.S. I encourage you to consider changing your username to something recognizable.
I am not sure whether this is directly relevant to the question, but the following elementary result might be of interest. Given a fixed natural number $k$, the operation $n \to kn$ can be realized by a sequential automaton, provided that $n$ is written in reversed binary notation (that is, least significant bit first). The number of states of the automaton is $k/2^r$, where $2^r$ is the largest power of $2$ dividing $k$. For instance, the operation $n \to 6n$ is realized by such a three-state automaton (diagram omitted).
For instance, $185 = 1 + 8 + 16 + 32 + 128$ and $6 \times 185 = 1110 = 2 + 4 + 16 + 64 + 1024$. Thus, in reversed binary, $185$ is written as $10011101$ and $1110$ (a bad choice of example value, I know...) as $01101010001$. Processing the input $10011101$ through this automaton traces a path whose output is exactly $01101010001$.
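The transitions of such an automaton are exactly the carry steps of schoolbook multiplication read least significant bit first. A minimal Python sketch of that carry machine (a direct construction of my own; it may use more states than the $k/2^r$ bound, which applies after the power-of-two part of $k$ is factored out as a pure shift, but it computes the same function):

```python
# State = current carry of the schoolbook multiplication by k.
# Reading the bits of n LSB-first, each step emits one output bit of
# k*n (also LSB-first) and moves to the next carry state.
def multiply_by_constant_automaton(k: int, n: int) -> int:
    carry, out_bits = 0, []
    for i in range(n.bit_length()):
        s = k * ((n >> i) & 1) + carry
        out_bits.append(s & 1)       # transition output
        carry = s >> 1               # next state
    while carry:                     # flush the final carry
        out_bits.append(carry & 1)
        carry >>= 1
    return sum(bit << i for i, bit in enumerate(out_bits))

assert multiply_by_constant_automaton(6, 185) == 1110
```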