The IBM 701

"IBM's first computer"

Feb 13, 2023

(Author’s note: This is my longest blog post on Substack so far, and I would very much welcome comments, corrections, improvements, or any kind of feedback at all. I’ll notify subscribers of changes I make following initial publication.)

Photo: [7]

History

The IBM 701 was introduced in 1952 and was IBM’s first (mass-produced, stored-program) computer.

The IBM 701 was mass-produced, with about 20 systems installed across the United States. It followed IBM’s experimental stored-program Selective Sequence Electronic Calculator, or “SSEC.” A single prototype SSEC was built and became operational in 1948, but it never left IBM.
Earlier data-processing machines had external plugboards (confusingly also called control panels) to program their operation. They were not stored-program computers like modern computers, with program and data sitting alongside each other in the same memory.1 (The IBM 701 did have its own external plugboards, but they were used only to configure its card reader, card punch, and printer.)
The IBM 701 was first called the Defense Calculator, then announced as the IBM 701 Electronic Data Processing Machine. IBM did not call it a “computer,” still holding to the old definition of “computers” as the people doing computations. This historic distinction faded quickly, and a “computer” became just another kind of machine, albeit a very useful one.

The IBM 701’s architecture is not unlike that of modern computers, but several aspects can seem quite surprising from the modern viewpoint. Today’s readers should remember that the IBM 701 had millions of times less power and capacity than a modern smartphone; it can seem amazing that it worked at all, but it was a significant commercial success.

IBM had been a powerhouse in “semi-automatic” data processing systems since its inception in 1911 as the “Computing-Tabulating-Recording Company,” with a history tracing all the way back to Herman Hollerith’s late nineteenth-century work with the US Census Bureau.2 Its data-processing machines until the 1950s were electromechanical, mostly using punched cards for bulk data I/O and relays for logic. The IBM 701 was IBM’s first (mass-produced, stored-program) computer.

“I think there is a world market for maybe five computers.”
— not Thomas J. Watson, Chairman and CEO of IBM; no one, ever

Competitors to IBM, including the well-funded upstart Univac, had announced stored-program computers before IBM, and IBM could be viewed as a follower in this field. Some people believed that Thomas J. Watson, IBM’s Chairman and CEO, had earlier dismissed the entire computer market with the short and extraordinarily wrong verdict, “I think there is a world market for maybe five computers,” but this is an urban myth; here’s a real quote instead:

“IBM had developed a paper plan for such a machine and took this paper plan across the country to some 20 concerns that we thought could use such a machine…. As a result of our trip, on which we expected to get orders for five machines, we came home with orders for 18.”
— Thomas J. Watson, Chairman and CEO of IBM, 1953, describing to shareholders the planning for the IBM 701 [10]

Watson later elaborated that “we thought in those early days that we either had to win this one or fail as a company — and that's why I think everybody put such effort into it.” [9]

In other words, although the market was indeed very small in its early days—just 5–18 computers—it has of course grown exponentially ever since computers became commercially available.

The large customer interest in the IBM 701 made it a distinct commercial success and helped steer IBM to dominate the computer market for decades to come.

Architecture

Data formats

The IBM 701 represented numbers in binary, with any conversions from/to decimal left to software. This contrasted with other early computers like the Univac Ⅰ, which typically held numbers as character strings in memory.[4]

The Univac Ⅰ was faster than the IBM 701 at reading and writing data with very little processing.
The IBM 701 was much faster at processing, which dominated scientific workloads; any extra cost for conversion to/from binary was swamped by the overall speedup.

The IBM 701’s memory held 2,048 36-bit words, which could also be addressed as 4,096 18-bit half-words. Numbers were stored as signed-magnitude (!) fixed-point values: one sign bit plus a 35-bit magnitude for 36-bit words, giving values from -(2³⁵ - 1) to 2³⁵ - 1, or one sign bit plus a 17-bit magnitude for 18-bit half-words, giving values from -(2¹⁷ - 1) to 2¹⁷ - 1.
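To make the signed-magnitude layout concrete, here is a minimal Python sketch. The function names are mine, and I am assuming that the sign sits in the top bit with 1 meaning minus; treat it as an illustration of the number format, not of the machine's circuitry. With one bit spent on the sign, the extremes work out to ±(2³⁵ - 1) for words and ±(2¹⁷ - 1) for half-words.

```python
WORD_BITS = 36
HALF_BITS = 18


def encode(value: int, bits: int = WORD_BITS) -> int:
    """Pack an integer into a signed-magnitude bit pattern.

    Assumption: the sign is the top bit, with 1 meaning minus.
    """
    mag_bits = bits - 1
    magnitude = abs(value)
    if magnitude >= 1 << mag_bits:
        raise OverflowError(f"{value} does not fit in {mag_bits} magnitude bits")
    sign = 1 if value < 0 else 0
    return (sign << mag_bits) | magnitude


def decode(pattern: int, bits: int = WORD_BITS) -> int:
    """Unpack a signed-magnitude bit pattern into an integer."""
    mag_bits = bits - 1
    magnitude = pattern & ((1 << mag_bits) - 1)
    return -magnitude if (pattern >> mag_bits) & 1 else magnitude


def value_range(bits: int) -> tuple:
    """Smallest and largest integers representable in `bits` bits."""
    limit = (1 << (bits - 1)) - 1
    return -limit, limit
```

For example, `value_range(HALF_BITS)` gives `(-131071, 131071)`, and `decode(encode(-5))` round-trips to `-5`.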

Fetching a half-word from memory interpreted the 18-bit value as the high-order (!) half of a 36-bit word, with the low-order bits set to zero. This helped minimize the IBM 701’s instruction set relative to the more modern alternative of interpreting it as the low-order bits.
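A small Python sketch of that widening step, under my own naming. One consequence worth noticing is that the half-word's top bit lands in the top bit of the full word, so if the sign is the top bit (my assumption), a half-word's sign ends up in the full word's sign position without any extra work.

```python
HALF_BITS = 18


def fetch_half_as_word(half: int) -> int:
    """Widen an 18-bit half-word to 36 bits by placing it in the
    high-order half and zero-filling the low-order half."""
    if not 0 <= half < (1 << HALF_BITS):
        raise ValueError("half-word must fit in 18 bits")
    return half << HALF_BITS
```

Read as an integer, the widened value is the half-word's value scaled by 2¹⁸; read as a fixed-point fraction, it is the same value with 18 more (zero) bits of precision.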

There was exactly one addressing mode (!), with no index registers (!). Addresses were physical (!), with no virtual memory. Computing and using an address to be computed at runtime (e.g., when indexing into an array) required self-modifying code (!).3
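For readers who have never seen self-modifying code, here is a toy Python simulation of the idea. Everything in it is hypothetical: the opcodes, their numbering, and the memory layout are invented for illustration and are not the IBM 701's. What it does share with the 701 is that instructions are ordinary numbers in memory, with the address in the low-order 12 bits, so a program can step through an array by adding 1 to one of its own instructions.

```python
# Hypothetical opcodes for a toy one-accumulator machine (NOT the 701's set).
HALT, LOAD, ADD, SUB, STORE, JUMP_IF_POS = range(6)
ADDR_BITS = 12
ADDR_MASK = (1 << ADDR_BITS) - 1


def ins(op: int, addr: int) -> int:
    """Encode an instruction as a plain integer: opcode above a 12-bit address."""
    return (op << ADDR_BITS) | addr


def run(mem: list) -> None:
    """Execute from address 0 until HALT, one instruction at a time."""
    ac, pc = 0, 0
    while True:
        word = mem[pc]
        op, addr = word >> ADDR_BITS, word & ADDR_MASK
        pc += 1
        if op == HALT:
            return
        elif op == LOAD:
            ac = mem[addr]
        elif op == ADD:
            ac += mem[addr]
        elif op == SUB:
            ac -= mem[addr]
        elif op == STORE:
            mem[addr] = ac
        elif op == JUMP_IF_POS:
            if ac > 0:
                pc = addr
        else:
            raise ValueError(f"unknown opcode {op}")


def sum_array(values: list) -> tuple:
    """Sum a non-empty list by patching the address of the ADD at location 1."""
    if not values:
        raise ValueError("values must be non-empty")
    ONE, COUNT, TOTAL, DATA = 11, 12, 13, 14
    program = [
        ins(LOAD, TOTAL),      # 0: AC = running total
        ins(ADD, DATA),        # 1: AC += current element (address gets patched)
        ins(STORE, TOTAL),     # 2: save the running total
        ins(LOAD, 1),          # 3: load instruction 1 as if it were data
        ins(ADD, ONE),         # 4: bump its address field by one
        ins(STORE, 1),         # 5: write the modified instruction back
        ins(LOAD, COUNT),      # 6: AC = elements remaining
        ins(SUB, ONE),         # 7: one fewer
        ins(STORE, COUNT),     # 8: save the count
        ins(JUMP_IF_POS, 0),   # 9: loop while elements remain
        ins(HALT, 0),          # 10: done
    ]
    mem = program + [1, len(values), 0] + list(values)
    run(mem)
    return mem[TOTAL], mem[1] & ADDR_MASK
```

Calling `sum_array([10, 20, 30])` returns the total together with the final address left in the patched instruction, which has walked one past the end of the array. A program that wanted to run again would first have to restore that instruction, which is part of why this style is so error-prone.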

There were only two registers: the 36-bit general-purpose Accumulator (AC),4 and the more special-purpose 36-bit Multiplier-Quotient (MQ). A few instructions treated the AC and the MQ as a single signed number with a 70-bit magnitude, with the AC holding the high-order 35 bits and the sign, the MQ holding the low-order 35 bits, and the MQ’s own sign ignored.
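The combined AC-MQ value can be sketched in a few lines of Python. This is a simplification under stated assumptions: both registers are modeled as 36-bit patterns with the sign in the top bit (1 meaning minus), and the Accumulator's extra overflow bits are left out.

```python
MAG_BITS = 35
MAG_MASK = (1 << MAG_BITS) - 1
SIGN_BIT = 1 << MAG_BITS


def combine(ac: int, mq: int) -> int:
    """Read AC and MQ as one signed number with a 70-bit magnitude.

    AC supplies the sign and the high-order 35 magnitude bits; MQ supplies
    the low-order 35 magnitude bits, and MQ's own sign bit is ignored.
    """
    magnitude = ((ac & MAG_MASK) << MAG_BITS) | (mq & MAG_MASK)
    return -magnitude if ac & SIGN_BIT else magnitude
```

For instance, an AC holding +1 with an MQ of zero reads as 2³⁵, and flipping only the MQ's sign bit leaves the combined value unchanged.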

The IBM 701 had no hardware support for floating-point arithmetic.5

Note that the IBM 701’s signed-magnitude values could hold both a positive zero and a negative zero (!), providing greater programmer flexibility at the cost of greater programmer complexity.6

“It is possible for a zero in this calculator to have either a plus or a minus sign…. This characteristic of the machine is sometimes very convenient….”
—Principles of Operation, Type 701 and Associated Equipment, IBM, 1953 [5]
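A quick Python sketch of why two zeros add programmer complexity. The constants and helper names are mine, and the sign-in-the-top-bit layout is an assumption: the two zeros are different bit patterns, so a test for zero has to look at the magnitude only.

```python
SIGN_BIT = 1 << 35
MAG_MASK = SIGN_BIT - 1

PLUS_ZERO = 0          # sign 0, magnitude 0
MINUS_ZERO = SIGN_BIT  # sign 1, magnitude 0


def is_zero(pattern: int) -> bool:
    """A word is zero when its magnitude is zero, whatever its sign."""
    return pattern & MAG_MASK == 0


def is_minus(pattern: int) -> bool:
    """The sign can still be tested independently, even on a zero."""
    return bool(pattern & SIGN_BIT)
```

Here `PLUS_ZERO == MINUS_ZERO` is false as a comparison of bit patterns, while `is_zero` accepts both. The convenience the manual mentions presumably comes from the sign surviving as a separate piece of information.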

Instruction format

The IBM 701’s instructions were 18 bits long, with a 5-bit opcode field and a 12-bit address field, and a sign bit (!).

The 5-bit opcode field allowed 33 (!) distinct operations.7
The address field was 12 bits long, plus the instruction’s sign bit (!). It usually represented a location in memory (but sometimes held an immediate value like the bit-count for shifts).
- A positive address referenced any of 2¹² (4,096) half-words, with the low-order bit of the address specifying which half.
- A negative address referenced any of the 2¹¹ (2,048) corresponding full 36-bit words, with the address’s low-order bit ignored.
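The decoding rules above can be sketched in Python as follows. The bit positions are my assumption (sign in the top bit, then the 5-bit opcode, then the 12-bit address), as is the choice of 1 for a minus sign; the `Decoded` type and its field names are invented for this illustration.

```python
from typing import NamedTuple, Optional


class Decoded(NamedTuple):
    opcode: int          # 5-bit operation code
    full_word: bool      # True when the sign is minus (full-word operand)
    word_index: int      # which of the 2,048 words is referenced
    half: Optional[int]  # low-order address bit for half-words, else None


def decode(instr: int) -> Decoded:
    """Split an 18-bit instruction into sign, opcode, and address parts."""
    if not 0 <= instr < (1 << 18):
        raise ValueError("instruction must fit in 18 bits")
    minus = bool((instr >> 17) & 1)
    opcode = (instr >> 12) & 0x1F
    address = instr & 0xFFF
    word_index = address >> 1
    if minus:
        # Full-word reference: the low-order address bit is ignored.
        return Decoded(opcode, True, word_index, None)
    # Half-word reference: the low-order bit picks one of the two halves.
    return Decoded(opcode, False, word_index, address & 1)
```

The sketch deliberately does not say which value of `half` means the left or right half of the word, since the text above does not say either.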

Implementation

The IBM 701’s logic circuitry used vacuum tubes; it was therefore a first-generation computer. (Transistors had been invented by this time but none were used in the IBM 701.)

Memory was electrostatic, with 72 Williams Tubes of 1,024 bits each.8 This provided a high-speed random-access memory (RAM), but without error detection or correction.
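As a sanity check on those numbers, a few lines of Python confirm that the tube count matches the memory size given earlier. The constant and function names are mine.

```python
TUBES = 72
BITS_PER_TUBE = 1024
WORD_BITS = 36
HALF_BITS = 18


def capacity() -> dict:
    """Total electrostatic storage expressed in bits, words, and half-words."""
    total_bits = TUBES * BITS_PER_TUBE
    return {
        "bits": total_bits,
        "words": total_bits // WORD_BITS,
        "half_words": total_bits // HALF_BITS,
    }
```

The result is 73,728 bits, or 2,048 words, or 4,096 half-words, and 4,096 is exactly the number of locations a 12-bit address can name.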

The IBM 701 could have up to four drum drives, each holding 2,048 words. Individual words on the drum could be read or written—with the drum addresses structured the same as memory addresses—or a carefully written tight loop could more quickly read or write a sequential range of words.

The IBM 701 executed a single instruction at a time, finishing each instruction before even fetching the next. This made it slower than modern computers that use pipelining and similar techniques, but it greatly simplified the execution of self-modifying code.

The IBM 701’s cycle time was 12µsec. The simplest instructions took 2–4 cycles; arithmetic instructions (which also read memory) took 3–5 cycles; stores took 2–5 cycles; shifts took 4 cycles; multiplies and divides took 38 cycles but could speed up or slow down the instructions following (presumably saving on circuitry). Thus, an add or subtract instruction took about 60µsec, while a multiply or divide instruction took 456µsec; stated differently, the IBM 701 could execute almost 17K add instructions per second, or over 2K multiply instructions.
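The arithmetic behind those rates is simple enough to check in Python. I take 5 cycles for an add (the top of the quoted 3–5 range, matching the 60µsec figure) and 38 cycles for a multiply; the function names are mine.

```python
CYCLE_US = 12  # cycle time in microseconds


def instruction_time_us(cycles: int) -> int:
    """Execution time in microseconds for an instruction of `cycles` cycles."""
    return cycles * CYCLE_US


def per_second(cycles: int) -> float:
    """How many back-to-back instructions of this length fit in one second."""
    return 1_000_000 / instruction_time_us(cycles)
```

A 5-cycle add comes to 60 microseconds, or roughly 16,667 per second; a 38-cycle multiply comes to 456 microseconds, or roughly 2,193 per second. These are upper bounds for straight-line runs of a single instruction type and ignore the interactions between multiplies and the instructions that follow them.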

An IBM 701 system was built from the following components:

- The IBM 701 Electronic Analytical Control Unit (shown here with the operator console toward the right)
- The IBM 706 Electrostatic Storage Unit (with the screens of the 72 Williams tubes visible externally, as seen here)
- The IBM 711 Punched Card Reader
- The IBM 716 Alphabetical Printer
- The IBM 721 Punched Card Recorder
- The IBM 726 Magnetic Tape Reader and Recorder
- The IBM 731 Magnetic Drum Reader and Recorder
Photos: [6]

An IBM 701 installation was relatively small compared to others of the time, but it still occupied about a thousand square feet (or a hundred square meters), including the necessary walkways between the components.

Photo: [3]

The IBM 701 is best remembered today as the first of IBM’s long-lived 700/7000 series of scientific computers, including the IBM 704, the IBM 709, the IBM 7090, the IBM 7094, the IBM 7094 Model Ⅱ, the IBM 7040,9 and the IBM 7044. These machines were not strictly binary-compatible, as the architectures expanded and contracted over time. They were IBM's scientific computers from the IBM 701's debut until the introduction of the IBM System/360 in 1964.

IBM also made several very different computers with similar model numbers intended for business data processing, not scientific, including the IBM 702, the IBM 705, the IBM 7010, the IBM 7070, the IBM 7072, the IBM 7074, and the IBM 7080; these used decimal arithmetic, not binary.
Finally, the IBM 7030 (“Stretch”) was a mostly unrelated IBM supercomputer from the same period that suffered from over-reaching goals (e.g., a “speed at least 100 times the IBM 704” [2]). The IBM 7030 had far fewer customers than first expected and was quickly withdrawn from the market.

References and further reading

(All web references were last fetched in February 2023.)

[1] Bashe, Charles J.; Lyle R. Johnson; John H. Palmer; Emerson W. Pugh, IBM's Early Computers, MIT Press (1985)
[2] Buchholz, Werner, editor. Planning a Computer System, McGraw-Hill Book Company (1962)
[3] Columbia University Computing History, “The IBM 701 Defense Calculator”
[4] Goldfinger, Roy, “Comparison of UNIVAC With IBM 701,” Courant Institute of Mathematical Sciences, New York University (1954)
[5] International Business Machines, “Principles of Operation, Type 701 and Associated Equipment”
[6] International Business Machines, “701 Photo Album”
[7] Selected Technologies, “11 Feb 1953 — IBM 701” (in Spanish)
[8] von Neumann, John, First Draft of a Report on the EDVAC (1945)
[9] Watson, Thomas J., “Explaining the 701's significance — Transcript”
[10] Wikipedia, “IBM 701”

1. The stored-program computer architecture has of course let software complexity increase exponentially ever since. This is the “von Neumann architecture,” named after John von Neumann; although von Neumann did not invent this architecture, he documented its presence on the early EDVAC computer. [8]

2. Hollerith’s punched-card tabulating machines had reduced the time to process the data from the US census from eight years for the Tenth Census (1880) to only six years for the Eleventh Census (1890), even with less total funding, and also earned Hollerith his PhD from Columbia University.

3. Self-modifying code essentially required stored-program control, so the ability to execute self-modifying code was no doubt seen as an advantage on the IBM 701. Self-modifying code is anathema to today’s programmers, since it can make programs very hard to understand, but yesterday’s programmers were of course of a hardier stock.

4. The Accumulator also had two associated overflow bits, and was described as a 38-bit register in most sources.

5. Choices in the IBM 701’s instruction-set architecture simplified the implementation of scaled or floating-point arithmetic in software. Interestingly, the immediate forerunner of the FORTRAN programming language was developed for the IBM 701 and so depended on software floating-point; FORTRAN itself was developed for the later IBM 704, which had added hardware floating-point support.

6. Most modern computers use two’s complement, which has only a single zero but which unfortunately allows one more negative number than there are positive numbers, adding to programmer complexity in a different way.

7. This sort of accounting gimmick is of course common in modern processors too.

8. The Williams Tube wrote bits onto the surface of a CRT, which also created a small electrical charge for a fraction of a second. Subsequent writes at the same location could sense the existing static charge there; a read was implemented by writing a dummy value to sense the previous value from the location’s static charge, then rewriting that original value. The contents of each tube were automatically read out and rewritten periodically so that bits written arbitrarily long ago could still be read. Each Williams Tube stored at most a single bit of a given word or half-word, presumably simplifying the circuitry.

9. The author has written this post preparatory to a post on the IBM 7040 and 7044, then others. He welcomes your comments.

Paul McJones

Feb 23, 2023

The earliest report from the Fortran project is https://www.softwarepreservation.org/projects/FORTRAN/BackusEtAl-Preliminary%20Report-1954.pdf; it says "The IBM Mathematical Formula Translating System or briefly, FORTRAN, will comprise a large set of programs to enable the IBM 704 to accept a concise formulation of a problem in terms of mathematical notation and to produce automatically a high speed 704 program for the solution of the problem." I worked with John in 1974-1975, and I remember him saying he'd consulted with Gene Amdahl (704 architect) on floating-point and index registers, and realized that with those features in the hardware, it was time for higher-level programmer assistance. And in fact the first Fortran compiler introduced a variety of optimization techniques (reduction in strength, etc.) -- see the softwarepreservation.org site for references.

Re BCPL: This is an excellent topic -- one of the first "system programming" languages (but also consider the various Algol 58 dialects -- see https://www.softwarepreservation.org/projects/ALGOL/algol58impl/), easily and widely ported, inspiration for Thompson and Ritchie, etc.

Re CPL: see http://www.ancientgeek.org.uk/CPL/ .

Re: TX-2: Definitely mention its influence on the Xerox Alto. (This was one of the few times Bob Taylor felt it appropriate to inform Chuck Thacker of prior art.) The subsequent D machines also multitasked their microprocessor for I/O, but were much more complex. I think there are references for the details (see http://www.bitsavers.org/pdf/xerox/, but it's likely not something you want to pursue).

Re Stretch/HARVEST : I don't know anything about it, but Fran Allen worked on it -- see https://www.computerhistory.org/collections/catalog/102621818 .

Re COMIT: Again, I never studied it, but it inspired Danny Bobrow to do a LISP version called Meteor. And that may have inspired Warren Teitelman to include a lot of pattern-matching features in BBN LISP/INTERLISP.

Feb 22, 2023 (edited)

"Interestingly, the first FORTRAN compiler was for the IBM 701 and so depended on software floating-point, but FORTRAN was not announced until the later IBM 704, which had hardware floating-point support." Actually, it was always for the IBM 704. Backus had worked on Speedcode for the 701, with software floating point and index registers. He realized he needed to do more for the IBM 704, with its hardware floating point and three index registers, thus the Fortran proposal in late 1953.

See:

John Backus. 1978. The history of Fortran I, II, and III. History of programming languages. Association for Computing Machinery, New York, NY, USA, 25–74. https://doi.org/10.1145/800025.1198345
