
CSGC 342

Birla Institute of Technology & Science, Pilani, Goa Campus
CS GC 342, Advanced Computer Organization
Test 1, 2nd Semester, 2007-2008
Date: 21-02-2008   Time: 08:30 AM-09:30 AM   Marks: 60

Q1. Fill in the blanks (answers should be written on the first page of the answer book) [12 x 2 = 24]
a) The binary equivalent of -0.25 (base 10) is ______.
b) ______ register is used as a link register by the Jump and Link instruction.
c) ______ offers a stable base register within a procedure for local memory references.
d) A large number of registers in the architecture will ______ the clock cycle time.
e) The set of instructions that transfer data between memory and registers are called ______ instructions.
f) The segment of the stack containing a procedure's saved registers and local variables is called ______.
g) In the MIPS architecture, ______ register will act as the $s8 register.
h) $f2 contains 0x7F800000 (hex), which is equivalent to ______.
i) ______ component affects instruction count, clock rate and CPI.
j) For the AND instruction, the value of the control bits for ALUOp is ______.
k) ______ architecture uses the same memory for storing instructions and data.
l) In source-initiated data transfer, the source does not send the next data item until the data accepted signal is ______.

Q2. Answers should be supported with suitable reasoning [4 x 3 = 12]
Q2a - What would be the value of the MSB in the target field for a backward jump instruction in MIPS?
Q2b - "Right shifting by i is equivalent to dividing by 2^i."
Q2c - In the MIPS architecture, which portion of memory will the machine code of the following C code occupy?
      # define pi 3.1428
Q2d - "Interconnection within the chip is faster than the interconnection between chips." Is it true?

Q3 - An application program in C takes 12 seconds on a processor with its old compiler. A new compiler is released that requires only 0.75 as many instructions as the old compiler. Unfortunately, it increases the CPI by 1.33. How fast can we expect the application to run using the new compiler?
[1 x 3 = 3]

Q4 - Find the corrupted instruction by converting the following machine code into its corresponding MIPS instruction. [3 x 2 = 6]
1) 02484020 hex
2) 8D2802C0 hex
3) 01115100 hex

Q5 - Implement optimum code for the following without using the MIPS subtraction instruction. [2 x 3 = 6]
1) $t3 = $t2 - $t1
2) $s4 = -($s3) (negate the value of $s3 and store it in $s4)

Q6 - Draw the datapath diagram for the store instruction with the necessary elements and explain the data storage operation steps. [1 x 9 = 9]

[END]

Birla Institute of Technology and Science, Pilani, Goa Campus
CSGC 342 Advanced Computer Organization
Test - 0   20% (60 Marks)   Closed Book
07.04.08, Monday (8.30-9.30 a.m.)

1. A computer architect needs to design the pipeline for a new processor. He has an example workload program fragment with 10^6 instructions. Each instruction takes 100 ps to finish.
a. How long does it take to execute this program fragment on a nonpipelined processor?
b. The current state-of-the-art microprocessor has about 20 pipeline stages. Assume it is perfectly pipelined. How much speedup will it achieve compared to the nonpipelined processor? -4-

2. Consider a processor with 4 GB of memory address space and 256 KB of direct-mapped cache. Assuming that the block size, and hence the cache line size, is 32 bytes, find the main memory address format. -6-

3. Consider a direct-mapped cache with 64 lines and a block size of 8 bytes.
a. What block number do byte addresses 1200H and 1234H map to?
b. Suppose the byte with address 1200H is stored in the cache. What are the addresses of the other bytes stored along with it in the same cache line? -10-

4. Give one software and one hardware method to solve the data hazard problem in a pipelined datapath. -5-

5. Consider the following pipeline reservation table:
[Reservation table: time steps 1 to 4 across the top; stages S1, S2, S3 down the side; cells marked X]
a. What are the forbidden latencies and the collision vector?
b. Draw the state transition diagram.
c. List all the simple cycles and greedy cycles.
d. Determine the minimal average latency.
e.
Let the pipeline clock period be 20 ns. Determine the throughput of this pipeline. -21-

6. The following figure shows a 4-dimensional hypercube structure.
a. Identify the addresses of all the nodes.
b. Show the communication path between
   i. P1 to P8
   ii. P5 to P13 -14-

Birla Institute of Technology and Science, Pilani, Goa Campus
CSGC 342 Advanced Computer Organization
Comprehensive Exam   30% (90 Marks)   2007-08 (Sem II)   Closed Book
01.05.08, Thursday (2.00-5.00 p.m.)

Important Note: i) Calculators are allowed but not to be shared. ii) Use X for an undetermined value.

1. Fill in the blanks.
a. ______ is the binary equivalent of 0.3d (up to 5-digit accuracy).
b. ______ and ______ are the two MIPS registers other than $s0 to $s7 whose values are preserved across a call.
c. In MIPS, subtract immediate is done using the ______ instruction.
d. No R-type instruction will have both ______ and ______ fields a positive value simultaneously.
e. The MIPS equivalent for the pseudo-instruction "b target" is ______.
f. In a multicycle implementation, the values of ALUsrcA and ALUsrcB in the execution stage of the beq instruction are ______ and ______ respectively.
g. A MIPS instruction in the format 'op r1 r2 r3' will help in shifting the value left. The contents of the r3 register immediately after the execution of the instruction will be ______ if the contents of r1 is 8 and r2 is 4. -10-

2. Convert the following C code into its corresponding MIPS instructions.
   int d;                          // line 1
   int fun(int, int, int);         // line 2
   int main()                      // line 3
   { int a=5, b=10; static int c = 20;   // line 4
     d = fun(a,b,c);               // line 5
     printf("%d",d);               // line 6
     return 0; }                   // line 7
   int fun(int x, int y, int z)    // line 8
   { return (x+y#z); }             // line 9
a. What would be the values of the $a0 to $a3, $v0, $v1 and $ra registers just before the execution of line 5, line 6, line 7 and line 9? Note that the MIPS instruction equivalents for line 6, line 7 and line 8 are in memory locations L1, L2 and L3 respectively.
b.
The values of variables 'c' and 'd' in the above program can be accessed by using the ______ and ______ registers. -10-

3. Why is $f0 not hardwired to the zero value?

4. In a multicycle implementation, what would be the contents of the A and B registers if the instruction that is going to execute is 00af8020 hex? What are the control signals and their values used during the execute stage of the above instruction? -6-

5. In a multicycle implementation, what would be the values of ALUsrcA and ALUsrcB in every stage (from fetch to memory read completion) for a jump instruction? -10-

6. A computer uses a memory unit with 256K words of 32 bits each. A binary instruction code is stored in one word of memory. The instruction has 4 parts: an indirect bit, an operation code, a register code part to specify one of 64 registers, and an address part.
a. What is the size of the instruction?
b. How many bits are there in the operation code, the register code part, and the address part? -2+3-

7. A cache system has a 95 percent hit ratio, an access time of 100 ns on a cache hit and an access time of 800 ns on a cache miss. What is the average time to access a word? -4-

8. The following reservation table corresponds to a two-function pipeline:
[Reservation table: time steps across the top; stages S1, S2, S3 down the side; cells marked A, B or AB]
List all four cross forbidden sets of latencies, the cross collision vectors and the corresponding combined cross collision matrices. -8-

9. There is a 4-segment floating-point add/subtract pipeline. Assume that the time delays of the four segments are t1 = 40 ns, t2 = 50 ns, t3 = 60 ns, t4 = 70 ns, and the interface registers have a delay of tr = 10 ns.
a. Determine the cycle time of the pipeline and the clock rate, and find the time taken to add 2000 pairs of numbers in this pipeline.
b. What is the speedup when these numbers are added?

10. Draw a flow chart depicting the sequence of events that takes place in source-initiated data transfer using handshaking. -4-

11. How many switches and how many stages are needed in an n x n omega switching network? -4-
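The timing quantities asked for in the 4-segment floating-point pipeline question above can be checked with a short calculation. The following is a minimal sketch, not a model answer. It assumes the usual textbook relations: the cycle time is the slowest segment delay plus the register delay, k + n - 1 cycles are needed for n tasks in a k-segment pipeline, and the non-pipelined unit takes the sum of all segment delays plus one register delay per task. The function name `pipeline_timing` and the last assumption are illustrative and do not come from the exam paper.

```python
def pipeline_timing(segment_delays_ns, register_delay_ns, n_tasks):
    """Return cycle time, clock rate, total times and speedup for a linear pipeline."""
    k = len(segment_delays_ns)

    # The clock must accommodate the slowest segment plus its interface register.
    cycle_ns = max(segment_delays_ns) + register_delay_ns
    clock_mhz = 1000.0 / cycle_ns

    # k cycles to fill the pipeline, then one result per cycle.
    pipelined_ns = (k + n_tasks - 1) * cycle_ns

    # Assumption: a non-pipelined unit needs the sum of all segment delays
    # plus a single register delay for every task.
    nonpipelined_ns = n_tasks * (sum(segment_delays_ns) + register_delay_ns)

    return {
        "cycle_ns": cycle_ns,
        "clock_mhz": clock_mhz,
        "pipelined_ns": pipelined_ns,
        "nonpipelined_ns": nonpipelined_ns,
        "speedup": nonpipelined_ns / pipelined_ns,
    }


result = pipeline_timing([40, 50, 60, 70], 10, 2000)
for name, value in result.items():
    print(f"{name}: {value}")
```

With the delays given in the question, this yields a cycle time of 80 ns and a total of 2003 cycles for 2000 pairs. The speedup figure depends on how the non-pipelined time is defined, so it changes if the register delay is left out of that term.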
12. A parallel machine with a peak performance of 1.0 PetaFLOPS has to be constructed using SGI Altix nodes (8-way SMP). The clock speed of each processing element is 2.8 GHz and SGI guarantees that each of its CPUs is capable of performing 8 FLOP in one single clock cycle. How many such nodes will be required to realize this machine? -4-

13. 60 Sun e450 workstations (4-way SMP) have been connected using cascaded Gigabit Ethernet. Each processor runs at a constant clock speed of 300 MHz. What is the speed that you, as a manufacturer, will guarantee this cluster will not exceed? -3-

14. The serial version of Smith-Waterman (a famous bioinformatics algorithm) has a time complexity of (log n)’. The MPI version has a time complexity of log (n) “#”. Which version is better?

15. State whether the following statements are TRUE (T) or FALSE (F): -10-
A | Grid computing supports heterogeneous clusters.
B | A computational grid should be ubiquitous.
C | The g in Oracle 10g stands for grid.
D | MPMD expands to Multiple Programs Multiple Data.
E | PVM can be used to achieve explicit parallelism.
F | MPI cannot be used to communicate between processes belonging to the same site.
G | CPU utilization of a parallel machine increases as its clock speed increases.
H | Single System Image for a parallel system can be readily achieved using any multi-processor operating system.
I | The message passing model cannot be used for communicating between processes on a grid.
J | Only UNIX-based clusters can be connected on a grid.
K | The number of processes within a communication universe is internally decided by MPI.
L | The message passing paradigm is faster as compared to the shared memory model for intra-computer communication.
M | The SETI project is a classical example of P2P computing.
N | A good load balance leads to a good decomposition.
O | MPI_Send, MPI_Recv, MPI_Bcast are examples of system calls used for parallel programming.
P | Turning a huge amount of output data into pictures that a scientist can understand is known as virtualization.
Q | Asynchronous communication is not supported by MPI.
R | The applications that can be executed on a grid are different from those that can be executed on a cluster.
S | The MPI_AllReduce call can be used to achieve synchronization.
T | Using a parallel compiler to generate a parallel version of your serial code is an example of explicit parallelism.

Birla Institute of Technology and Science, Pilani, Goa Campus
CSGC 342 Advanced Computer Organization
Comprehensive Exam   30% (90 Marks)   2007-08 (Sem II)   Closed Book
01.05.08, Thursday (2.00-5.00 p.m.)

Name: ______   Marks: ______   Recheck request: ______

Opcode | Register | Memory

Forbidden sets: F_AA = ______   F_AB = ______   F_BA = ______   F_BB = ______
Collision vectors: ______
M_A = ______   M_B = ______

Register values by line (line 5, line 6, line 7, line 9):
$a0 | $a1 | $a2 | $a3 | $v0 | $v1 | $ra

Cycle time = ______
Clock rate = ______
Time taken to add 2000 pairs = ______

Control signals: ______

ALUsrcA | ALUsrcB for each stage: IF | ID | EXE | MEM | MRC
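For the two peak-performance questions on the SGI Altix machine and the Sun e450 cluster, the arithmetic can be sketched as follows. This is a hedged illustration rather than an official solution. It assumes theoretical peak = nodes x CPUs per node x clock rate x FLOP per cycle. The e450 question does not state a FLOP-per-cycle figure, so the value of 1 used below is purely an assumption, and both function names are hypothetical.

```python
def nodes_for_peak(target_flops, cpus_per_node, clock_hz, flop_per_cycle):
    """Smallest whole number of nodes whose combined peak reaches target_flops."""
    per_node = cpus_per_node * clock_hz * flop_per_cycle
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-target_flops // per_node)


def cluster_peak_flops(nodes, cpus_per_node, clock_hz, flop_per_cycle):
    """Theoretical upper bound on the cluster's floating-point rate."""
    return nodes * cpus_per_node * clock_hz * flop_per_cycle


# SGI Altix: 8-way SMP, 2.8 GHz, 8 FLOP per cycle, 1.0 PetaFLOPS target.
altix_nodes = nodes_for_peak(10**15, 8, 2_800_000_000, 8)
print("Altix nodes needed:", altix_nodes)

# Sun e450: 60 nodes, 4-way SMP, 300 MHz.
# Assumption: 1 FLOP per cycle per CPU (not given in the question).
e450_peak = cluster_peak_flops(60, 4, 300_000_000, 1)
print("e450 cluster peak (GFLOPS):", e450_peak / 1e9)
```

The node count is rounded up because a fractional node cannot be bought. The cluster figure is an upper bound only, since it ignores interconnect and memory effects.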
