BD2 Text and Sol Draft
C. XML (9 p.)
<!ELEMENT Company ( Person+, Project+ )>
<!ELEMENT Person ( Name, Role, … )>
<!ATTLIST Person id ID #REQUIRED>
<!ELEMENT Project ( Name, StartDate, Activity+ )>
<!ELEMENT Activity ( Code, Task+, Activity* )>
<!ELEMENT Task ( InitiallyPlannedBudget, CompletionPercentage, CostSoFar )>
<!ATTLIST Task PersonInCharge IDREF #REQUIRED>
The DTD above describes the ongoing projects carried out by a company. Each project breaks down into activities, and each
activity into elementary tasks, and possibly other sub-activities. Each task is assigned to a specific person, in charge of reporting
the current progress status (of course a task may be completed spending less than the initially planned budget, or be more costly
than planned, even at an early stage). Unspecified elements only contain PCData. Extract in XQuery:
(3 p.) 1. The Name of the project(s) that involve(s) the largest number of people.
(6 p.) 2. A list of the projects sorted by cost-effectiveness, defined as the ratio between the planned overall cost and the
actual expected overall cost at full completion. Assume that each task will either cost (i) as much as planned, if
not started yet (i.e., if completed at 0%), or (ii) proportionally to the cost already incurred for the completed fraction.
Note that there are three lock upgrades and a lock downgrade (dashed).
C.1
let $maxcount := max( for $pj in doc(..)//Project
return count( distinct-values( $pj//Task/@PersonInCharge ) ) )
for $p in doc(..)//Project
where count( distinct-values( $p//Task/@PersonInCharge ) ) = $maxcount
return $p/Name
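As a cross-check of the C.1 logic outside XQuery, the following minimal Python sketch runs the same computation on a small made-up document. The sample data, the SAMPLE string and the function name largest_teams are illustrative assumptions, not part of the exam text; Person elements and most child elements are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, loosely following the DTD (Person elements,
# StartDate and the children of Task are left out to keep it short).
SAMPLE = """
<Company>
  <Project><Name>Alpha</Name>
    <Activity><Code>A1</Code>
      <Task PersonInCharge="p1"/><Task PersonInCharge="p2"/>
      <Activity><Code>A1.1</Code><Task PersonInCharge="p1"/></Activity>
    </Activity>
  </Project>
  <Project><Name>Beta</Name>
    <Activity><Code>B1</Code><Task PersonInCharge="p3"/></Activity>
  </Project>
</Company>
"""

def largest_teams(xml_text):
    """Return the names of the projects with the most distinct people in charge."""
    root = ET.fromstring(xml_text)
    # For each project, collect the distinct PersonInCharge values of all
    # Task descendants at any nesting depth (the analogue of $p//Task).
    people = {
        p.findtext("Name"): {t.get("PersonInCharge") for t in p.iter("Task")}
        for p in root.iter("Project")
    }
    top = max(len(s) for s in people.values())
    # Keep every project that reaches the maximum, so ties are all returned.
    return [name for name, s in people.items() if len(s) == top]

print(largest_teams(SAMPLE))
```

Using a set per project plays the role of distinct-values: p1 is in charge of two tasks in Alpha but is counted once.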
C.2
for $Proj in doc(..)//Project
let $Planned := sum( $Proj//Task/InitiallyPlannedBudget )
let $NotStarted := sum( $Proj//Task[CompletionPercentage=0]/InitiallyPlannedBudget )
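let $Started := sum( for $t in $Proj//Task[ CompletionPercentage > 0 ]
(: assumes CompletionPercentage is a fraction in (0,1]; multiply by 100 if it is on a 0-100 scale :)
return $t/CostSoFar div $t/CompletionPercentage )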
let $CostRatio := ( $NotStarted + $Started ) div $Planned
order by $CostRatio
return <Project>
{ $Proj/Name }
<PlannedCost> { $Planned } </PlannedCost>
<ExpectedCost> { $NotStarted + $Started } </ExpectedCost>
<Efficiency> { (1 - $CostRatio) * 100 , "%" } </Efficiency>
</Project>
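The extrapolation rule used in C.2 can be illustrated with a small Python sketch. The task tuples and project names are made up, and the function names are hypothetical. Like the query, the sketch divides the cost so far by the completion value directly, which is only correct if that value is a fraction in (0, 1]; on a 0-100 scale the quotient would need to be multiplied by 100.

```python
def expected_cost(tasks):
    """tasks: list of (planned_budget, completion_fraction, cost_so_far)."""
    total = 0.0
    for planned, done, spent in tasks:
        if done == 0:
            total += planned        # not started: assume the planned budget
        else:
            total += spent / done   # started: extrapolate the cost so far
    return total

def cost_ratio(tasks):
    """Expected overall cost divided by planned overall cost."""
    planned = sum(t[0] for t in tasks)
    return expected_cost(tasks) / planned

# Hypothetical projects (completion given as a fraction, by assumption).
projects = {
    "Alpha": [(100, 0.5, 40), (50, 0.0, 0)],  # planned 150
    "Beta": [(200, 0.25, 75)],                # planned 200
}

# Ascending cost ratio, i.e. most cost-effective project first.
ranking = sorted(projects, key=lambda name: cost_ratio(projects[name]))
print(ranking)
```

Sorting by ascending expected/planned ratio gives the same order as sorting by descending planned/expected ratio, which is the cost-effectiveness defined in the question.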
D
Scenario (a)
1. Scan Ticket, look up the PassId in the B+ tree for the selected tickets to Rome, and check the date in main memory:
40K + 390K / val(To) ∙ ( 3 ) = 40K + 3K ∙ 3 = 49 K
2_dumb. Scan Passenger, scan the tickets for the selected Passengers born on 1/4/200, and count the tuples while scanning:
2K + 30K / val(To) ∙ ( 40K ) = 2K + 6 ∙ 40K = 242 K
2_smart. Scan Passenger, cache the PassIds of the FEW (6!) Passengers born on 1/4/200, and scan Tickets only once:
2K + 40K = 42 K
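The three estimates for Scenario (a) can be re-derived with a few lines of Python. The variable names are illustrative; the figures are the ones that appear in the solution above (3K selected tickets, 6 selected passengers, a cost of 3 per B+ lookup), taken as given rather than re-derived from the schema statistics.

```python
K = 1_000

# Figures as stated in the solution above, in the solution's own cost unit.
scan_ticket = 40 * K          # full scan of Ticket
scan_passenger = 2 * K        # full scan of Passenger
selected_tickets = 3 * K      # 390K / val(To), as computed above
lookup_cost = 3               # cost of one B+ lookup, as used above
selected_passengers = 6       # passengers matching the birth date

plan_1 = scan_ticket + selected_tickets * lookup_cost
plan_2_dumb = scan_passenger + selected_passengers * scan_ticket
plan_2_smart = scan_passenger + scan_ticket

print(plan_1, plan_2_dumb, plan_2_smart)
```

The gap between the two variants of plan 2 comes entirely from rescanning Ticket once per selected passenger instead of once overall.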
Scenario (b)
1. Both indexes are useless here, the cost is the same.
2. Find the 6 Passengers with the hash, then use the other hash to count their tickets (without retrieving the tuples!):
1 + 6 ∙ ( 1 + 1 ) = 13
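The same kind of check applies to the Scenario (b) estimate. In this sketch the names are hypothetical, and the split of the per-passenger cost into two unit accesses simply mirrors the (1 + 1) term above; the solution does not spell out what each unit stands for.

```python
def hash_plan_cost(initial_lookup, matches, per_match_accesses):
    """Cost of one initial hash access plus a fixed cost per matching passenger."""
    return initial_lookup + matches * per_match_accesses

# 1 initial hash access, 6 matching passengers, (1 + 1) accesses for each.
cost_b2 = hash_plan_cost(initial_lookup=1, matches=6, per_match_accesses=1 + 1)
print(cost_b2)
```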