
HW6-1 (43 points) Suppose we wish to write a procedure that computes the inner product of two vectors u and v. An abstract version of the function has a CPE of 14-18 with x86-64 for different types of integer and floating-point data. By doing the same sort of transformations we did to transform the abstract program combine1 into the more efficient combine4, we get the following code:

    void inner4(vec_ptr u, vec_ptr v, data_t *dest)
    {
        long i;
        long length = vec_length(u);
        data_t *udata = get_vec_start(u);
        data_t *vdata = get_vec_start(v);
        data_t sum = (data_t) 0;

        for (i = 0; i < length; i++) {
            sum = sum + udata[i] * vdata[i];
        }
        *dest = sum;
    }

Our measurements show that this function has a CPE of 1.50 for integer data and 3.00 for floating-point data. For data type double, the x86-64 assembly code for the inner loop is as follows:

    # Inner loop of inner4. data_t = double. OP = *.
    # udata in %rbp, vdata in %rax, sum in %xmm0, i in %rcx, limit in %rbx
    .L15:                                   # loop:
      vmovsd 0(%rbp,%rcx,8), %xmm1          # Get udata[i]
      vmulsd (%rax,%rcx,8), %xmm1, %xmm1    # Multiply by vdata[i]
      vaddsd %xmm1, %xmm0, %xmm0            # Add to sum
      addq   $1, %rcx                       # Increment i
      cmpq   %rbx, %rcx                     # Compare i:limit
      jl     .L15                           # If <, goto loop

Assume that the functional units have the latencies and issue times given in Figure 5.12 (and in the course notes).

A. Diagram how this instruction sequence would be decoded into operations, and show how the data dependencies between them would create a critical path of operations, in the style of Figures 5.13 (figure: opt/dpb-sequential) and 5.14 (figure: opt/dpb-flow and figure: opt/dpb-flow-abstract). (25 points.)

B. For data type double, what lower bound on the CPE is determined by the critical path? Give a numerical value and an explanation. (6 points.)

C. Assuming similar instruction sequences for the integer code as well, what lower bound on the CPE is determined by the critical path for integer data? Give a numerical value and an explanation. (6 points.)

D. Explain how the floating-point version can have a CPE of 3.00 even though the multiplication operation requires 5 cycles. (6 points.)

HW6-2 (27 points) Write a version of the inner product procedure described in the previous problem that uses six-way loop unrolling (6×1; no parallelism). (11 points.)
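As a starting point for the six-way unrolling request, here is a minimal sketch in C. It is not an official solution. The `vec_rec` struct and the two small accessor functions are hypothetical stand-ins for the course's vector library, which is not shown in the question, and the name `inner6x1` is chosen here for illustration. The loop handles six elements per iteration but keeps a single accumulator, so every addition still depends on the one before it.

```c
#include <assert.h>
#include <stddef.h>

typedef double data_t;

/* Hypothetical stand-in for the course's vector type (assumption). */
typedef struct {
    long len;
    data_t *data;
} vec_rec, *vec_ptr;

/* Hypothetical stand-ins for the accessors used by inner4 (assumption). */
static long vec_length(vec_ptr v) { return v->len; }
static data_t *get_vec_start(vec_ptr v) { return v->data; }

/* Inner product, 6x1 unrolling: six elements per iteration, one accumulator. */
void inner6x1(vec_ptr u, vec_ptr v, data_t *dest)
{
    assert(u != NULL && v != NULL && dest != NULL);

    long i;
    long length = vec_length(u);
    long limit = length - 5;            /* last index where i+5 is still valid */
    data_t *udata = get_vec_start(u);
    data_t *vdata = get_vec_start(v);
    data_t sum = (data_t) 0;

    /* Main loop: six products per iteration, all added into the same sum. */
    for (i = 0; i < limit; i += 6) {
        sum = sum + udata[i]     * vdata[i];
        sum = sum + udata[i + 1] * vdata[i + 1];
        sum = sum + udata[i + 2] * vdata[i + 2];
        sum = sum + udata[i + 3] * vdata[i + 3];
        sum = sum + udata[i + 4] * vdata[i + 4];
        sum = sum + udata[i + 5] * vdata[i + 5];
    }

    /* Cleanup loop: the remaining 0 to 5 elements. */
    for (; i < length; i++) {
        sum = sum + udata[i] * vdata[i];
    }
    *dest = sum;
}
```

Because the six additions form one sequential chain through `sum`, unrolling in this form reduces loop overhead (index update, compare, branch) but does not shorten the chain of dependent additions.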

