# Transistor level implementation of a 4 phase input clock signal into an 8 phase output clock signal

Ben van Straalen

Abstract- A switching radio mixer can use multiple phases to reject higher harmonics, and using 8 phases instead of 4 rejects more of them. An 8 phase mixer does, however, require an 8 phase clock signal, which is more complicated to generate than a 4 phase clock signal. This 8 phase clock can be generated from a 4 phase clock. This method is compared, in power consumption and speed, to generating the 8 phase clock from a 2 phase clock. Both designs are windmill type clock dividers. The designs are first explained on logic gate level and then implemented on transistor level using 180 nm MOSFETs in LTspice. Dividing from 2 phases to 8 phases turned out to be more power efficient than dividing from 4 phases, because dividing from 4 phases requires 3-input logic gates instead of the 2-input logic gates used when dividing from 2 phases.

#### I. INTRODUCTION

An 8 phase clock signal can be generated in different ways. Two methods will be explored: generating the 8 phase signal from a 4 phase clock and generating the 8 phase signal from a 2 phase (regular) clock. These two approaches will be compared in power consumption and speed with LTspice.

LTspice is not realistic enough for proper semiconductor simulations. The results are, however, still useful to compare the two dividing approaches.

The designs discussed in this paper are all windmill type clock dividers derived from Thijssen's windmill [1]. Thijssen's windmill will be referred to as the 2-4 windmill, as it converts a 2 phase input clock to a 4 phase output clock. The other types of windmill dividers will be the 4-8 windmill and the 2-8 windmill.

# II. THEORY

The goal of the frequency dividers used for radio mixers is to generate a multi phase clock signal from a clock signal with fewer phases. In this case an 8 phase clock is generated from either a 2 or a 4 phase input clock. An 8 phase clock signal is illustrated in figure 1, where Qn are the different (output) clock signals and the dashed lines mark 1 period. To generate an 8 phase clock signal from a 4 phase clock, the input clock must be double the frequency of the output clock. To generate the 8 phase clock from a 2 phase input, the input clock has to be 4 times the frequency of the output clock.




Fig. 1: An 8 phase clock. Notice that each signal has a 12.5% duty cycle, so no two signals are high at the same time
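The frequency relations stated above can be checked with a few lines of Python. This is only a minimal sketch; the function names are illustrative and do not come from the paper or from LTspice.

```python
def input_frequency_hz(output_frequency_hz, input_phases, output_phases=8):
    """Input clock frequency needed to obtain the requested output frequency."""
    if output_phases % input_phases != 0:
        raise ValueError("output phases must be a multiple of input phases")
    # Each input phase has to serve output_phases / input_phases output phases.
    return output_frequency_hz * (output_phases / input_phases)


def duty_cycle(output_phases=8):
    """Duty cycle of one output when the outputs never overlap."""
    return 1 / output_phases


if __name__ == "__main__":
    for phases in (2, 4):
        f_in = input_frequency_hz(250e6, phases)
        print(f"{phases} phase input for 250 MHz output: {f_in / 1e6:.0f} MHz")
    print(f"duty cycle per output: {duty_cycle() * 100:.1f}%")
```

For a 250 MHz output this gives the 500 MHz and 1000 MHz input clocks used later in the simulations.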

To generate these clock signals efficiently and with little delay from the input to the output, the input clock should only have to go through 1 logic gate to generate the output clock signal. NOR gates will be used for this task. Together with the NAND gate, the NOR gate is the simplest logic gate to implement on transistor level, and thus also the fastest and most power efficient. Unlike the output of a NAND gate, the output of a NOR gate is high for only 1 of its input combinations. This makes the NOR gate ideal for this application, as each clock signal should only have a duty cycle of 12.5% in an 8 phase clock.
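The difference between the two gate types can be made concrete by enumerating their truth tables. The sketch below is illustrative only; the helper names are assumptions.

```python
from itertools import product


def nor(*inputs):
    """Output is 1 only when every input is 0."""
    return int(not any(inputs))


def nand(*inputs):
    """Output is 0 only when every input is 1."""
    return int(not all(inputs))


def high_combinations(gate, n_inputs):
    """Number of input combinations for which the gate output is high."""
    return sum(gate(*combo) for combo in product((0, 1), repeat=n_inputs))


if __name__ == "__main__":
    for n in (2, 3):
        print(n, "inputs: NOR high for", high_combinations(nor, n),
              "combination(s), NAND high for", high_combinations(nand, n))
```

Regardless of the number of inputs, the NOR gate is high for exactly one combination, which is the property the windmills rely on.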

# A. 2-4 windmill

In this section the windmill divider from Thijssen's paper about a BLE receiver will be discussed and expanded upon with two 8 phase versions [1].

The 2-4 windmill divider uses NOR gates which pass every second clock cycle. The NOR gate generating output Q1 is shown in figure 2. EN1 is generated by an RS latch which is triggered by outputs Q2 and Q4, as shown in figures 3 and 4. Together this forms 1 module of the windmill. The fact that the RS latch is driven by other output signals makes up the circular design of the windmill divider and allows the 2-4 windmill to use very few components. The complete 2-4 windmill is shown in figure 5. Notice that 1 RS latch can be shared by two modules.



Fig. 2: The NOR gate generating Q1 in the 2-4 windmill



Fig. 3: 1 module of the 2-4 windmill



Fig. 4: The signals used in 1 module of the 2-4 windmill, the dashed lines mark 1 period



Fig. 5: The complete 2-4 windmill
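The behaviour of the 2-4 windmill can be sketched with an idealised logic-level model. The model below is an assumption-laden illustration, not the transistor circuit: it advances in half input periods, assumes that Q1 and Q3 are gated by one clock phase and Q2 and Q4 by the other, and assumes that each shared latch delivers an enable signal and its complement. Only the fact that Q2 and Q4 trigger the latch of EN1 is taken from the text.

```python
def nor(*inputs):
    return int(not any(inputs))


def simulate_2_4_windmill(n_steps, en1=0, en2=1):
    """Return (Q1, Q2, Q3, Q4) for each half period of the input clock.

    en1 and en2 are the initial latch states; EN3 and EN4 are their
    complements because each latch is shared by two modules.
    """
    rows = []
    for step in range(n_steps):
        clk = step % 2          # assumed: CLK is low in even steps
        clkb = 1 - clk
        en3, en4 = 1 - en1, 1 - en2
        q1, q3 = nor(clk, en1), nor(clk, en3)
        q2, q4 = nor(clkb, en2), nor(clkb, en4)
        rows.append((q1, q2, q3, q4))
        # Latch updates: an output enables the next module and disables
        # the module two positions back.
        if q4:
            en1 = 0
        if q2:
            en1 = 1
        if q1:
            en2 = 0
        if q3:
            en2 = 1
    return rows


if __name__ == "__main__":
    for row in simulate_2_4_windmill(8):
        print(row)
```

In this model every initial latch state leads to exactly one output being high per step, which agrees with the statement that the 2-4 windmill does not need a reset circuit.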

# B. 2-8 windmill

The 2-8 windmill converts the 2 phase input clock into an 8 phase output clock by making the ring of the windmill twice as long. It consists of the modules shown in figures 6 and 7. The complete circuit is shown in figure 8. Unlike in the 2-4 windmill, the latches cannot be shared, because the enable signals have a duty cycle of 75% instead of the 50% of the 2-4 windmill. The other output of the latch thus has a duty cycle of 25%, which cannot be used on the opposing section, as can be done in the 2-4 windmill. The 2-8 windmill also needs a small reset circuit to start up in the right order, something the 2-4 windmill does not need. The reset circuit will be explained in section III-F.



Fig. 6: 1 module of the 2-8 windmill



Fig. 7: The signal used in 1 module of the 2-8 windmill

# C. 4-8 windmill

The 4-8 windmill converts a 4 phase input clock to an 8 phase output clock by using 2 synchronised windmills, as shown in figure 11. Because of the 4 phase input, the 4-8 windmill uses 3-input NOR gates, shown in figures 9 and 10, instead of the 2-input NOR gates found in the 2-4 and 2-8 windmills. The 4-8 windmill also needs a reset circuit to synchronise the two windmills when starting up. The reset circuit will be explained in section III-F.



Fig. 9: 1 module of the 4-8 windmill



Fig. 10: The signal used in 1 module of the 4-8 windmill



Fig. 8: The complete 2-8 windmill



Fig. 11: The two windmills which make up the complete 4-8 windmill

# III. SIMULATION

The 4-8 and 2-8 windmill dividers will be built and simulated in LTspice. The simulations will be used to compare the power consumption and the speed.

To measure how fast a NOR gate can switch, the 10-90 rise time will be used: the time it takes the output signal to rise from 10% to 90% of the supply voltage.

180 nm MOSFETs will be used, whose gate size can be adjusted. The supply voltage will be 1.8 V.
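As an illustration of this measurement, the sketch below extracts the 10-90 rise time from a sampled waveform using linear interpolation between samples. It is a generic post-processing example with assumed function names, not the procedure used inside LTspice.

```python
def crossing_time(times, volts, level):
    """First time at which a rising waveform crosses `level` (interpolated)."""
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        if v0 < level <= v1:
            fraction = (level - v0) / (v1 - v0)
            return times[i] + fraction * (times[i + 1] - times[i])
    raise ValueError("waveform never crosses the requested level")


def rise_time_10_90(times, volts, vdd=1.8):
    """Time between the 10% and 90% crossings of the supply voltage."""
    t10 = crossing_time(times, volts, 0.1 * vdd)
    t90 = crossing_time(times, volts, 0.9 * vdd)
    return t90 - t10


if __name__ == "__main__":
    # Ideal ramp from 0 V to 1.8 V in 1 ns, used only as a test signal.
    print(rise_time_10_90([0.0, 1e-9], [0.0, 1.8]))
```

For the ideal 1 ns ramp the function returns 80% of the ramp duration, as expected.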



Fig. 12: A 3-input NOR gate

A 3-input NOR gate built from MOSFETs is shown in figure 12. The output is pulled low by each of the N MOSFETs and can only become high if all N MOSFETs are turned off. In that case all the P MOSFETs are conducting and they pull the output high. This is the expected behaviour of a NOR gate: the output is only high if all inputs are low. The bulk of the P MOSFETs is connected to the supply voltage and the bulk of the N MOSFETs is connected to ground.
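The series and parallel structure can be expressed as a small switch-level model. This is an idealised sketch in which each MOSFET is either fully on or fully off; the function names are assumptions.

```python
from itertools import product


def nor3_networks(in1, in2, in3):
    """Return (pull_up_conducts, pull_down_conducts) for a 3-input NOR gate."""
    # P MOSFETs conduct on a low input and are in series: all must conduct.
    pull_up = (not in1) and (not in2) and (not in3)
    # N MOSFETs conduct on a high input and are in parallel: one is enough.
    pull_down = bool(in1 or in2 or in3)
    return pull_up, pull_down


def nor3_output(in1, in2, in3):
    pull_up, pull_down = nor3_networks(in1, in2, in3)
    # Exactly one network conducts, so the output never floats or shorts.
    assert pull_up != pull_down
    return int(pull_up)


if __name__ == "__main__":
    for combo in product((0, 1), repeat=3):
        print(combo, "->", nor3_output(*combo))
```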

# A. Order of the P MOSFETs

In a 3-input NOR gate the P MOSFETs are in series while the N MOSFETs are in parallel, as can be seen in figure 12. In the windmills only one of the P MOSFETs switches to turn the NOR gate on, while the other P MOSFETs are already turned on. The same applies to the 2-input NOR gate. There is thus a choice of which P MOSFET should turn on the NOR gate. Two scenarios will be examined: figure 13, where P3 turns the NOR gate on, and figure 15, where P1 turns the NOR gate on.



Fig. 13: A 3-input NOR gate with the input connected to P3



Fig. 14: The voltages of different terminals of the 3-input NOR gate when P3 turns the NOR gate on



Fig. 15: A 3-input NOR gate with the input connected to P1



Fig. 16: The voltages of different terminals of the 3-input NOR gate when P1 turns the NOR gate on

In figure 13, P3 turns on the NOR gate. The source voltages of P2 and P1 then also have to rise to the supply voltage, since these terminals are at 800 mV when the NOR gate is turned off, as can be seen in figure 14.

In figure 15, P1 turns on the NOR gate. The drain voltages of P3 and P2 are already at 1.8 V, as can be seen in figure 16. This causes the output to start rising earlier than in figure 14, as can be seen in figure 17. Although the actual 10-90 rise time is similar, having P1 switch on the NOR gate consumes less power. So in the windmills the clock signal which switches on the NOR gate will be connected to IN1, which is connected to P1.



Fig. 17: The output voltages of both variations of the 3-input NOR gate including their current draw and the 10% and 90% lines

# B. Sizing MOSFETs

In LTspice the width and the length of the MOSFET gates can be adjusted. The length is the distance between the drain and source regions of the MOSFET and the width is how wide the MOSFET is. Only the width will be adjusted in these simulations. Making the width larger decreases the on resistance of the drain source path, so the MOSFET can deliver more current. However, since the gate becomes wider, it also takes more current to charge and discharge the gate.

When the NOR gate has no load connected to it, the output is only loaded by the NOR gate itself. This means that if all MOSFETs in the NOR gate have their gate widths made twice as large, the time it takes for the output to rise or fall remains almost equal. This is because the internal load the output sees has increased by the same factor.

Increasing all the MOSFETs in size does, however, increase the power consumption. In figure 18 the rise and fall of two NOR gates have been plotted, where it can be seen that increasing the size of all MOSFETs results in more power and only a small increase in rise and fall time.



Fig. 18: The rise and fall of two NOR gates, with their power consumptions
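The scaling argument can be illustrated with a first-order RC model. This is a rough sketch under stated assumptions: the on resistance is taken to be inversely proportional to the width, the self-loading capacitance proportional to the width, and the component values are hypothetical rather than extracted from the 180 nm models.

```python
import math


def rc_rise_time_10_90(scale, r_unit, c_self_unit, c_load=0.0):
    """10-90 rise time of a single-pole RC model for a gate scaled in width."""
    r_on = r_unit / scale                     # wider gate, lower resistance
    c_total = c_self_unit * scale + c_load    # wider gate, more self-loading
    return math.log(9) * r_on * c_total       # ln(9) * RC for 10% to 90%


def switching_energy(scale, c_self_unit, c_load=0.0, vdd=1.8):
    """Energy drawn from the supply to charge the output node once."""
    return (c_self_unit * scale + c_load) * vdd ** 2


if __name__ == "__main__":
    R_UNIT, C_UNIT = 1e3, 1e-15   # hypothetical 1 kOhm and 1 fF at scale 1
    for scale in (1, 2):
        print(scale,
              rc_rise_time_10_90(scale, R_UNIT, C_UNIT),
              switching_energy(scale, C_UNIT))
```

In this simplified model the unloaded rise time is independent of the scale while the switching energy doubles, which matches the trend in figure 18. Once an external load is added, the larger gate becomes faster.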

#### C. Non switching MOSFETs

In figure 15 only N1 switches the NOR gate off. N2 and N3 can thus be decreased in size to save power without impacting the fall time. This tactic does not work for the P MOSFETs since they are in series: increasing the on resistance of P2 and P3 will limit how much current P1 can source. Since the N MOSFETs are in parallel, N1 is not affected by the higher on resistance of N2 and N3.

In figure 19 the rise and fall times of the 3 scenarios have been plotted. Decreasing N2 and N3 decreases both the fall and rise time slightly, because the internal load is smaller. Decreasing P2 and P3 does not affect the fall time, but the rise time is increased.



Fig. 19: The rise and fall of 3 different NOR gates, the decreased MOSFETs have been made 3 times smaller

# D. Larger P MOSFETs

N MOSFETs are more efficient than P MOSFETs: for the same size, an N MOSFET can supply more current. In a NOR gate the P MOSFETs are in series while the N MOSFETs are in parallel, which makes this mismatch between the P and N MOSFETs even worse. As a result the rise time is much longer than the fall time, as can be seen in figure 18, where the P and N MOSFETs have equal size. The solution is to make the P MOSFETs bigger until an equal rise and fall time is achieved, as the output clock signal should have an equal rise and fall time.
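A first-order estimate of the required size ratio follows from the on resistances. The expressions below are a textbook approximation and not a result of this paper. They assume $n$ P MOSFETs in series, a single N MOSFET pulling down, equal gate overdrive, and an on resistance inversely proportional to mobility and width; the symbols $\mu_n$, $\mu_p$, $W_N$, $W_P$ and the constant $k$ are introduced here for illustration only.

$$
R_{\text{rise}} \approx n \cdot \frac{k}{\mu_p W_P}, \qquad R_{\text{fall}} \approx \frac{k}{\mu_n W_N}
$$

$$
R_{\text{rise}} = R_{\text{fall}} \;\Longrightarrow\; \frac{W_P}{W_N} \approx n \cdot \frac{\mu_n}{\mu_p}
$$

The required ratio thus grows with the number of inputs, which is why the mismatch is more severe for a 3-input NOR gate than for a 2-input NOR gate. The exact ratio has to be found by simulation, since parasitic capacitances and the load are ignored here.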

#### E. Load of the windmill

The windmill dividers drive the switch capacitors in the radio mixer. In LTspice the windmills will be loaded by N MOSFETs with an on resistance of 5 ohm at a drain source voltage of 100 mV. Since the supply voltage of the windmill is 1.8 V, these MOSFETs have a much greater gate source voltage than drain source voltage and therefore operate in the triode region.
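The operating point of the load MOSFET can be checked with a short calculation. In this sketch the threshold voltage of 0.5 V and the grounded source are assumptions made for illustration; only the 5 ohm, 100 mV and 1.8 V values come from the text.

```python
def load_current_a(v_ds, r_on):
    """Drain current of the load MOSFET, treated as a resistor."""
    return v_ds / r_on


def in_triode(v_gs, v_ds, v_th):
    """Triode condition for an N MOSFET: on, and V_DS below the overdrive."""
    return v_gs > v_th and v_ds < v_gs - v_th


if __name__ == "__main__":
    V_TH_ASSUMED = 0.5   # hypothetical threshold voltage in volts
    print("load current [mA]:", load_current_a(0.1, 5.0) * 1e3)
    print("triode region:", in_triode(1.8, 0.1, V_TH_ASSUMED))
```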

#### F. RS latch and resets

The latches that will be used in the windmills are RS NOR latches. This latch consists of two NOR gates connected output to input, as shown in figure 20. Since both the 2-8 windmill and the 4-8 windmill need a reset circuit, the latches need to be resettable. To achieve this, 3-input NOR gates will be used where the third input is the reset signal. When this signal is low the latch functions as normal. When the reset signal is high the latch is locked. Only one of the outputs is locked, however, since the reset signal can only force a NOR gate low. The reset still works, because the latch is unable to latch into the other state. For the 2-8 windmill the fact that only one of the latch outputs is locked is no problem, since only one of the latch outputs is used. For the 4-8 windmill the consequence is that half of the output NOR gates are able to rise during reset.



Fig. 20: A RS NOR latch
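The locking behaviour can be illustrated with a logic-level model of the latch. This is a sketch under assumptions: the input names `set_in`, `clear_in`, `lock_q` and `lock_qb` are hypothetical, and the gates are evaluated one after the other until the outputs stop changing.

```python
def nor(*inputs):
    return int(not any(inputs))


def settle_latch(set_in, clear_in, lock_q=0, lock_qb=0, q=0, qb=1):
    """Return the settled (q, qb) of two cross-coupled 3-input NOR gates."""
    for _ in range(4):
        new_q = nor(clear_in, qb, lock_q)
        new_qb = nor(set_in, new_q, lock_qb)
        if (new_q, new_qb) == (q, qb):
            break
        q, qb = new_q, new_qb
    return q, qb


if __name__ == "__main__":
    print("set:  ", settle_latch(1, 0))
    print("hold: ", settle_latch(0, 0, q=1, qb=0))
    print("clear:", settle_latch(0, 1, q=1, qb=0))
    # With the lock on the q side, q is forced low but qb follows set_in.
    for s in (0, 1):
        print("locked, set_in =", s, "->", settle_latch(s, 0, lock_q=1))
```

With the lock asserted, the locked output stays low whatever the trigger inputs do, while the other output still follows its trigger input. This is the sense in which only one of the outputs is locked.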

In the latch, P3 will be made bigger than the other P MOSFETs. In section III-C it was shown that having differently sized P MOSFETs had only negative consequences. In this case, however, it is beneficial: increasing P3 provides a lower on resistance, so the 3-input NOR gate can behave more like a 2-input NOR gate when P3 is turned on. The larger P3 results in a bigger load for the reset signal when P3 is switched, but since the reset signal is static during normal operation, this does not hurt the performance of the latch.

For the 2-8 windmills all 8 latches are locked and released at the same time. 7 of the latches are locked so that the enable signal is high, only EN1 is low during reset. This means that only Q1 can be high, while the other output signals are forced low. After the reset is lifted, Q1 triggers the latch generating EN2 and EN2 becomes low, allowing Q2 to become high. This cascades until all latches are triggered in the correct order and the windmill turns in the correct order.
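The start-up sequence can be sketched with an idealised logic-level model of the 2-8 windmill. The model rests on assumptions that are not stated in the text: odd outputs are gated by one clock phase and even outputs by the other, and each enable signal is pulled low by the previous output and pulled high again by the next output.

```python
def nor(*inputs):
    return int(not any(inputs))


RESET_STATE = (0, 1, 1, 1, 1, 1, 1, 1)   # only EN1 is low after reset


def simulate_2_8_windmill(n_steps, enables=RESET_STATE):
    """Return (Q1..Q8) for each half period of the input clock."""
    en = list(enables)
    rows = []
    for step in range(n_steps):
        clk = step % 2          # assumed: CLK is low in even steps
        clkb = 1 - clk
        q = [nor(clk if k % 2 == 0 else clkb, en[k]) for k in range(8)]
        rows.append(tuple(q))
        for k in range(8):
            if q[(k - 1) % 8]:  # previous output enables this module
                en[k] = 0
            if q[(k + 1) % 8]:  # next output disables this module again
                en[k] = 1
    return rows


if __name__ == "__main__":
    for row in simulate_2_8_windmill(8):
        print(row)
    print("without reset:", simulate_2_8_windmill(1, enables=(0,) * 8)[0])
```

Starting from the reset state the outputs become high one at a time in the order Q1 to Q8. Starting with all enable signals low, four outputs are high at once in this model, which illustrates why the 2-8 windmill needs a reset.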

The 4-8 windmill consists of two 4 phase windmills which always start up correctly. The two windmills do, however, have to be synchronised. Thus they are locked by the reset signal and the second windmill is released 1/8 of an output period later than the first windmill.

### G. Input clock buffer

The input clock signals will be made with an LTspice voltage source and a clock buffer consisting of two inverters, as shown in figure 21. Since LTspice voltage sources are ideal, the input clocks have to be buffered to generate a realistic clock. The power consumption of these buffers is also part of the power consumption of the circuit, so they have to be sized correctly.



Fig. 21: The clock buffers used in the windmills

#### H. Sizing the complete windmill

The windmills will be sized with the following steps:

1) Scaling P to N MOSFETs: The first step is to size the P MOSFETs to the N MOSFETs, to get an equal rise and fall time at no load. The ratio between the size of P and N MOSFETs will need to change slightly once the load is added. This is done for the 3-input NOR gate, the 2-input NOR gate and the inverters in the clock buffer.

2) Sizing output NOR gate to the latch: When the windmill is assembled, the output NOR gates are loaded by the latches and the latches are loaded by the output NOR gates. Since the rise and fall time of the output NOR gates is more important than the rise and fall time of the latches, the latches can be made smaller than the output NOR gates, until the latches are just fast enough to drive the output NOR gates, as shown in figure 22. At this point N2 and N3 (as shown in figure 15) in the output NOR gates can also be decreased to reduce power. This is not necessary in the latches, as both N MOSFETs are pulling the output down when switching state.

3) Clock buffers: The clock buffers are reduced in size until they are just able to drive the output NOR gates without compromising the rise and fall time of the output clock signal. The first inverter in the clock buffer can be reduced further in size, since it only drives one small inverter instead of the 4 NOR gates the second inverter has to drive.

4) Load: To each output NOR gate a load as described in section III-E is connected. The output NOR gates, latches and clock buffers have to be increased in size while keeping the same ratio in size between their MOSFETs. Thus all MOSFETs in the windmill will be scaled by a certain factor. The windmill has to be scaled up until it can drive the load with the desired rise and fall time. The ratio between the rise and fall time of the output will change slightly once the load is connected. Thus the ratio between the P and N MOSFETs in the output NOR gate has to be adjusted slightly. Because of the circular design of the windmill, fine-tuning the windmills is not trivial.



Fig. 22: Q1 and EN1 of the 500MHz 4-8 windmill and the 10 and 90% lines

Figure 22 shows the output of the 4-8 windmill; the output of the 2-8 windmill looks similar. Notice point A, where there is a bump in the fall of EN1. At this point Q7, the trigger signal for the latch of EN1, turns off. Until point A two N MOSFETs were pulling EN1 low. After point A, Q7 is off and only one N MOSFET remains pulling down.

At point B there is a bump in Q1, at this point EN1 should prevent Q1 from turning on, however its corresponding N MOSFET in the output NOR is small and the other N MOSFETs are off. So it cannot prevent Q1 from rising a bit, notice also the dip in EN1 at this point in time. The peak at B is only 400 mV which is below the threshold voltage, so it cannot trigger the output or the latch Q1 is connected to.

#### IV. RESULTS

The 2-8 and 4-8 windmills have been simulated at 3 different output frequencies: 250 MHz, 100 MHz and 25 MHz. Going to a frequency lower than 25 MHz is not useful, because at 25 MHz the MOSFETs in the RS latches have already reached their minimum size. For each frequency the windmills have been optimised such that the output clock signals have a rise and fall time which is 1/5 of the on time of the signal, as shown in tables 23 and 24. However, the 4-8 windmill was not able to reach a 100 picosecond rise and fall time. So for 250 MHz both windmills have a rise and fall time of 140 picoseconds, marked with an asterisk in the tables.
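The target timings follow directly from the output frequency. The sketch below reproduces this arithmetic; the function name is illustrative.

```python
def windmill_timings_ns(output_frequency_mhz, phases=8, edge_fraction=1 / 5):
    """Return (period, on time, target rise/fall time) in nanoseconds."""
    period = 1e3 / output_frequency_mhz    # 1 / MHz = microseconds, so *1e3
    on_time = period / phases              # each output is high 1/8 period
    edge_time = on_time * edge_fraction    # target: 1/5 of the on time
    return period, on_time, edge_time


if __name__ == "__main__":
    for f_mhz in (250, 100, 25):
        period, on_time, edge = windmill_timings_ns(f_mhz)
        print(f"{f_mhz} MHz: period {period} ns, on time {on_time} ns, "
              f"target edge {edge:.2f} ns")
```

For 250 MHz the target is 0.1 ns, which is the value the 4-8 windmill could not reach.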

The current has been measured over 10 cycles starting at the second cycle, so as not to measure any start-up currents. LTspice can automatically determine the average current and the RMS current over a period. Since the power consumption needs to be determined, the average current will be used. The power consumption has been split into the different parts of the windmill: the clock drivers, the output NOR gates, the RS latches and the power going into the output.

To determine if the windmills have any static currents, the same windmill, in this case the fast 4-8 windmill, is measured at different frequencies. The current at 0 Hz can then be determined using a trend line; this is the static current. The static current was 0, as is expected for a fully dynamic circuit.
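The trend line method rests on the usual first-order model for the supply current of a CMOS circuit. The relation below is a general approximation, not a fit to the measured data; $C_{\text{eff}}$ is an assumed lumped switched capacitance and $I_{\text{static}}$ the frequency-independent part of the current.

$$
I_{\text{avg}}(f) \approx I_{\text{static}} + C_{\text{eff}} \, V_{DD} \, f, \qquad P = V_{DD} \, I_{\text{avg}}
$$

Under this model the average current is linear in the frequency $f$, so extrapolating the measurements to $f = 0$ yields $I_{\text{static}}$.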

| 4-8 windmill            | high  | medium | low |
|-------------------------|-------|--------|-----|
| input frequency [MHz]   | 500   | 200    | 50  |
| output frequency [MHz]  | 250   | 100    | 25  |
| output period [ns]      | 4     | 10     | 40  |
| output on time [ns]     | 0.5   | 1.25   | 5   |
| rise and fall time [ns] | *0.14 | 0.25   | 1   |

Fig. 23: The timings of the 4-8 windmills at a high, medium and low frequency

| 2-8 windmill            | high  | medium | low |
|-------------------------|-------|--------|-----|
| input frequency [MHz]   | 1000  | 400    | 100 |
| output frequency [MHz]  | 250   | 100    | 25  |
| output period [ns]      | 4     | 10     | 40  |
| output on time [ns]     | 0.5   | 1.25   | 5   |
| rise and fall time [ns] | *0.14 | 0.25   | 1   |

Fig. 24: The timings of the 2-8 windmill at a high, medium and low frequency

| current [uA] | 250 MHz, 4-8 windmill | 250 MHz, 2-8 windmill | 100 MHz, 4-8 windmill | 100 MHz, 2-8 windmill | 25 MHz, 4-8 windmill | 25 MHz, 2-8 windmill |
|--------------|-----------------------|-----------------------|-----------------------|-----------------------|----------------------|----------------------|
| Itot         | 10060.4               | 4915.6                | 1274.5                | 763.9                 | 105.53               | 94                   |
| Iclock       | 4139.6                | 2014.3                | 455.09                | 253.12                | 13.677               | 10.177               |
| Ilatch       | 1266.8                | 1063.7                | 102.65                | 95.511                | 4.8805               | 4.7172               |
| Inor         | 4654                  | 1837.6                | 716.76                | 415.27                | 86.969               | 79.106               |
| Iload        | 1.195                 | 3.6955                | 0.54112               | 0.378                 | 0.17454              | 0.12942              |

Fig. 25: The power consumption of each windmill version, split into the different circuits

As can be seen in figure 25, the 2-8 windmill is more efficient at each simulated frequency. At 25 MHz the 4-8 windmill uses 1.12 times the power of the 2-8 windmill and at 100 MHz 1.67 times the power. At 250 MHz the 4-8 windmill even uses 2.05 times the power of the 2-8 windmill. For visualisation the power consumption is plotted in figure 27. The line connecting the points gives an indication of what power consumption can be expected for frequencies between 25 MHz and 250 MHz. Higher frequencies cannot be achieved with this simulation and at lower frequencies the lower limit of the component size was met.
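These ratios can be reproduced from the total currents in table 25. The sketch below uses only those values and the 1.8 V supply voltage; the variable and function names are illustrative.

```python
VDD = 1.8  # supply voltage in volts

# Total average current in microampere per output frequency in MHz (table 25).
I_TOT_UA = {
    250: {"4-8": 10060.4, "2-8": 4915.6},
    100: {"4-8": 1274.5, "2-8": 763.9},
    25: {"4-8": 105.53, "2-8": 94.0},
}


def power_ratio(frequency_mhz):
    """Power of the 4-8 windmill relative to the 2-8 windmill."""
    currents = I_TOT_UA[frequency_mhz]
    return currents["4-8"] / currents["2-8"]


def power_mw(frequency_mhz, windmill):
    """Average power in milliwatt: P = VDD * I_avg."""
    return VDD * I_TOT_UA[frequency_mhz][windmill] * 1e-3


if __name__ == "__main__":
    for f_mhz in (25, 100, 250):
        print(f"{f_mhz} MHz: ratio {power_ratio(f_mhz):.2f}, "
              f"4-8: {power_mw(f_mhz, '4-8'):.3f} mW, "
              f"2-8: {power_mw(f_mhz, '2-8'):.3f} mW")
```

Since both windmills use the same supply voltage, the power ratio equals the current ratio.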

| current% | 250 MHz, 4-8 windmill | 250 MHz, 2-8 windmill | 100 MHz, 4-8 windmill | 100 MHz, 2-8 windmill | 25 MHz, 4-8 windmill | 25 MHz, 2-8 windmill |
|----------|-----------------------|-----------------------|-----------------------|-----------------------|----------------------|----------------------|
| clock%   | 41.15%                | 40.98%                | 35.71%                | 33.14%                | 12.96%               | 10.83%               |
| latch%   | 12.59%                | 21.64%                | 8.05%                 | 12.50%                | 4.62%                | 5.02%                |
| nor%     | 46.26%                | 37.38%                | 56.24%                | 54.36%                | 82.41%               | 84.16%               |

Fig. 26: The percentage of the current that each circuit in the different windmills consumes, the current going into the load is not included as it is negligible
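The percentages follow from the currents in table 25 by dividing each part by the sum of the three parts. A minimal sketch for the 250 MHz 4-8 windmill, with illustrative names:

```python
# Currents of the 250 MHz 4-8 windmill in microampere (table 25).
CURRENTS_UA = {"clock": 4139.6, "latch": 1266.8, "nor": 4654.0}


def shares_percent(currents):
    """Share of each part in the summed current, in percent."""
    total = sum(currents.values())
    return {name: 100 * value / total for name, value in currents.items()}


if __name__ == "__main__":
    for name, share in shares_percent(CURRENTS_UA).items():
        print(f"{name}: {share:.2f}%")
```

The three currents add up to the Itot entry of table 25, which confirms that the load current is not part of the total.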



Fig. 27: The current of the windmills plotted against the frequency, notice both axes are logarithmic



Fig. 28: Current graph of the 250 MHz 4-8 windmill, one of the outputs is shown as reference

Figure 28 shows the currents of the 4-8 windmill; the 2-8 windmill has a similar graph. The Vout line shows Q8 and Q1. The vertical bar in the Itot and Iclock graphs is a peak current, which occurs because the first clock buffer receives the ideal clock signal from the voltage source. In an actual circuit (or a better simulation) this sharp peak would not be there. Notice that the latch current is small compared to the clock buffer and output NOR currents, as the results in table 26 show.

#### V. DISCUSSION

The 2-8 windmill is much more efficient at 250 MHz. Although this difference is smaller at lower frequencies, the 2-8 windmill is still more efficient at every frequency.

# A. 2 input and 3 input NOR gates

The 2-8 windmill is most likely much more efficient because of the 3-input NOR gate in the 4-8 windmill. The 3-input NOR gate is in essence slower than the 2-input NOR gate in the 2-8 windmill, and thus needs to be larger and consume more power to achieve the same rise and fall time. This difference is illustrated in figure 29.



Fig. 29: Speed Comparison of the 2-input and 3-input NOR gate

In figure 29 the 2-input and 3-input NOR gates are compared. Only the rise times are shown, since there is less difference between the fall times of the different NOR gates. The equal size NOR gates have equally sized P and N MOSFETs. In the optimised versions P1, P2 and P3 are 7.5 times larger than N1, while N2 and N3 are 10 times smaller than N1 (MOSFET numbers as shown in figure 15).

All NOR gates in figure 29 achieve a rise time faster than 100 picoseconds, except for the equal size 3-input NOR gate. It was claimed, however, that the 3-input NOR gate in the windmill was not able to reach a 100 picosecond rise time. The reason is that in this example the NOR gates have no load. Once the NOR gate is in circuit, it has to drive the output load and the latch. When the NOR gate is made larger to be able to drive the load, the latch also has to be made larger to be able to drive the NOR gate. Hence the ideal scenario in this example cannot be achieved in the windmill.

# B. Power consumption of the different components

The power consumption of the windmills is divided into 3 categories: the clock buffers, the latches and the output NOR gates. The power going into the load is negligible compared to the power consumed by the windmill, as can be seen in table 25, hence it is omitted in table 26. The power going into the clock buffers from the external clock is even smaller than the power going into the load, so both can be neglected.

At the lower frequencies the output NOR gates use the majority of the power, most likely because the load remains the same, although the power consumed by the load is still negligible. At high frequencies the percentage the other components use is higher. The 2-8 windmill spends relatively more power on its latches than the 4-8 windmill. Most likely this is because it has double the latches, even though each latch only has to drive 1 output NOR gate. The 4-8 windmill does spend more power on its latches in absolute numbers.

The 4-8 windmill spends relatively more power on its clock buffers, most likely because the 3-input NOR gates are more difficult to drive than the 2-input NOR gates. Furthermore, both windmills have 8 output NOR gates, but in the 4-8 windmill each NOR gate is driven by 2 clock signals.

#### C. Recommendations

The latches used are constructed out of 3-input NOR gates, because of the reset functionality needed. However, the latch could also be constructed out of a 2-input NOR gate and a 3-input NOR gate, since the latch can be locked by pulling just 1 of the NOR gates low with a reset signal. This latch could be more efficient than the 3-input NOR latches used. The latch would then, however, no longer be symmetric. This could be a problem for the 4-8 windmill, where it could drive 1 output NOR gate faster than the other. The 2-8 windmill is also affected, since 1 of its latches needs to be locked in the opposite state to the others.

# VI. CONCLUSION

The 2-8 windmill is more power efficient at each simulated frequency. At lower frequencies the difference is small, with the 4-8 windmill using 1.12 times the power of the 2-8 windmill. At high frequencies the difference is substantial, with the 4-8 windmill using 2.05 times the power. This big difference is in part due to the fact that a 3-input NOR gate is in essence slower than a 2-input NOR gate. Since the application of the windmill divider is to drive a high frequency radio mixer, the 2-8 windmill is the obvious winner.

#### REFERENCES

[1] P. Q. Bart J. Thijssen, Eric A. M. Klumperink and B. Nauta, "2.4 GHz highly selective IoT receiver front end with power optimized LNTA, frequency divider and baseband analog FIR filter," *IEEE Solid-state circuits* magazine, 2021.