author | Bill Schmidt <wschmidt@linux.ibm.com> | 2012-07-25 03:07:08 +0000 |
---|---|---|
committer | Bill Schmidt <wschmidt@linux.ibm.com> | 2012-07-25 03:07:08 +0000 |
commit | e9679245cc242a87893411de9d1cd45573ce2895 | |
tree | ae51baf9345c38bdbb25531d72d91e057a404b0d /gcc/tree-vect-loop.c | |
parent | 1537a430151a1fe0aaffa360a6bf2ed3c27cffe0 |
2012-07-24 Bill Schmidt <wschmidt@linux.ibm.com>
* doc/tm.texi: Regenerate.
* targhooks.c (default_init_cost): Add prologue and epilogue costs.
(default_add_stmt_cost): Likewise; also handle NULL stmt_info.
(default_finish_cost): Add prologue and epilogue costs.
* targhooks.h (default_add_stmt_cost): Change parameter list.
(default_finish_cost): Likewise.
* target.def (init_cost): Change documentation string.
(add_stmt_cost): Change documentation string and parameter list.
(finish_cost): Likewise.
* target.h (vect_cost_model_location): New enum.
* tree-vectorizer.h (struct _slp_tree): Remove cost substruct.
(struct _slp_instance): Remove cost substruct; rename stmt_cost_vec
to body_cost_vec.
(SLP_INSTANCE_OUTSIDE_OF_LOOP_COST): Remove.
(SLP_INSTANCE_STMT_COST_VEC): Rename to SLP_INSTANCE_BODY_COST_VEC.
(SLP_TREE_OUTSIDE_OF_LOOP_COST): Remove.
(struct _vect_peel_extended_info): Rename stmt_cost_vec to
body_cost_vec.
(struct _stmt_vec_info): Remove cost substruct.
(STMT_VINFO_OUTSIDE_OF_LOOP_COST): Remove.
(stmt_vinfo_set_outside_of_loop_cost): Remove.
(builtin_vectorization_cost): New function.
(vect_get_stmt_cost): Change to use builtin_vectorization_cost.
(add_stmt_cost): Change parameter list.
(finish_cost): Likewise.
(vect_model_simple_cost): Likewise.
(vect_model_store_cost): Likewise.
(vect_model_load_cost): Likewise.
(record_stmt_cost): Likewise.
(vect_get_load_cost): Likewise.
(vect_get_known_peeling_cost): Likewise.
* tree-vect-loop.c (vect_get_known_peeling_cost): Change parameter
list; call record_stmt_cost for prologue and epilogue costs.
(vect_estimate_min_profitable_iters): Call add_stmt_cost for
prologue and epilogue costs; remove computation of vec_outside_cost;
return vec_prologue_cost and vec_epilogue_cost from finish_cost.
(vect_model_reduction_cost): Revise call to add_stmt_cost for body
costs; call add_stmt_cost for prologue and epilogue costs.
(vect_model_induction_cost): Revise call to add_stmt_cost for body
costs; call add_stmt_cost for prologue costs.
* tree-vect-data-refs.c (vect_get_data_access_cost): Change parameter
list for function and arguments for calls to vect_get_load_cost and
vect_get_store_cost.
(vect_peeling_hash_get_lowest_cost): Change argument list for calls to
vect_get_data_access_cost and vect_get_known_peeling_cost; use
temporary vectors prologue_cost_vec and epilogue_cost_vec for the
latter call and discard their results; rename stmt_cost_vec to
body_cost_vec; correct possible storage leak for body_cost_vec.
(vect_peeling_hash_choose_best_peeling): Rename stmt_cost_vec to
body_cost_vec.
(vect_enhance_data_refs_alignment): Rename stmt_cost_vec to
body_cost_vec; add extra dummy parameter on calls to
vect_get_data_access_cost; tolerate null si->stmt; add vect_body to
argument list on call to add_stmt_cost.
* tree-vect-stmts.c (record_stmt_cost): Change parameter list;
rename stmt_cost_vec to body_cost_vec; tolerate null stmt_info; call
builtin_vectorization_cost; add "where" parameter on call to
add_stmt_cost.
(vect_model_simple_cost): Change parameter list; call record_stmt_cost
for prologue costs; remove call to stmt_vinfo_set_outside_of_loop_cost;
rename stmt_cost_vec to body_cost_vec.
(vect_model_promotion_demotion_cost): Add vect_body argument to call
to add_stmt_cost; call add_stmt_cost for prologue costs; remove call
to stmt_vinfo_set_outside_of_loop_cost.
(vect_model_store_cost): Change parameter list; call record_stmt_cost
for prologue costs; add vect_body argument to call to record_stmt_cost;
rename stmt_cost_vec to body_cost_vec; remove call to
stmt_vinfo_set_outside_of_loop_cost.
(vect_get_store_cost): Rename stmt_cost_vec to body_cost_vec; add
vect_body argument to calls to record_stmt_cost.
(vect_model_load_cost): Change parameter list; rename stmt_cost_vec to
body_cost_vec; add vect_body argument to calls to record_stmt_cost;
remove call to stmt_vinfo_set_outside_of_loop_cost.
(vect_get_load_cost): Change parameter list; rename stmt_cost_vec to
body_cost_vec; add vect_body argument to calls to record_stmt_cost;
call record_stmt_cost for prologue costs.
(vectorizable_store): Change argument list for call to
vect_model_store_cost.
(vectorizable_load): Change argument list for call to
vect_model_load_cost.
(new_stmt_vec_info): Remove assignment to
STMT_VINFO_OUTSIDE_OF_LOOP_COST.
* config/spu/spu.c (spu_init_cost): Add prologue and epilogue costs.
(spu_add_stmt_cost): Likewise; also handle NULL stmt_info.
(spu_finish_cost): Add prologue and epilogue costs.
* config/i386/i386.c (i386_init_cost): Add prologue and epilogue costs.
(i386_add_stmt_cost): Likewise; also handle NULL stmt_info.
(i386_finish_cost): Add prologue and epilogue costs.
* config/rs6000/rs6000.c (rs6000_init_cost): Add prologue and epilogue
costs.
(rs6000_add_stmt_cost): Likewise; also handle NULL stmt_info.
(rs6000_finish_cost): Add prologue and epilogue costs.
* tree-vect-slp.c (vect_free_slp_instance): Rename
SLP_INSTANCE_STMT_COST_VEC to SLP_INSTANCE_BODY_COST_VEC.
(vect_create_new_slp_node): Remove assignment to
SLP_TREE_OUTSIDE_OF_LOOP_COST.
(vect_get_and_check_slp_defs): Change parameter list; change argument
lists to calls to vect_model_store_cost and vect_model_simple_cost.
(vect_build_slp_tree): Change parameter list; change argument lists
to calls to vect_model_load_cost, vect_get_and_check_slp_defs, and
recursive self-calls; remove setting of outside_cost from
SLP_TREE_OUTSIDE_OF_LOOP_COST; add vect_body argument to call to
record_stmt_cost.
(vect_analyze_slp_instance): Rename stmt_cost_vec to body_cost_vec;
rename SLP_INSTANCE_STMT_COST_VEC to SLP_INSTANCE_BODY_COST_VEC;
remove assignment to SLP_INSTANCE_OUTSIDE_OF_LOOP_COST; record SLP
prologue costs.
(vect_bb_vectorization_profitable_p): Rename stmt_cost_vec to
body_cost_vec; handle null ci->stmt; add vect_body argument to call
to add_stmt_cost; simplify calls to targetm.vectorize.
builtin_vectorization_cost; return vec_prologue_cost and
vec_epilogue_cost from finish_cost.
(vect_update_slp_costs_according_to_vf): Rename stmt_cost_vec to
body_cost_vec; add vect_body argument to call to add_stmt_cost.
git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@189836 138bc75d-0d04-0410-961f-82ee72b054a4
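The ChangeLog above describes cost hooks that now track prologue, body, and epilogue costs separately. The following is a minimal sketch of that idea, not the code from targhooks.c: every name prefixed `sketch_` or `SKETCH_` is hypothetical, and the per-statement unit cost is passed in directly instead of being looked up through builtin_vectorization_cost.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the location enum this commit adds to target.h.  */
enum sketch_location
{
  SKETCH_PROLOGUE = 0,
  SKETCH_BODY = 1,
  SKETCH_EPILOGUE = 2
};

/* Allocate three zeroed accumulators, one per location.  The caller frees.  */
unsigned *
sketch_init_cost (void)
{
  return (unsigned *) calloc (3, sizeof (unsigned));
}

/* Charge COUNT statements of UNIT_COST each to the bucket selected by WHERE
   and return the amount charged.  */
unsigned
sketch_add_stmt_cost (unsigned *data, int count, int unit_cost,
                      enum sketch_location where)
{
  unsigned cost;

  assert (data != NULL && count >= 0 && unit_cost >= 0);
  cost = (unsigned) count * (unsigned) unit_cost;
  data[where] += cost;
  return cost;
}

/* Report the three accumulated totals through output parameters.  */
void
sketch_finish_cost (const unsigned *data, unsigned *prologue_cost,
                    unsigned *body_cost, unsigned *epilogue_cost)
{
  *prologue_cost = data[SKETCH_PROLOGUE];
  *body_cost = data[SKETCH_BODY];
  *epilogue_cost = data[SKETCH_EPILOGUE];
}

/* Demo with made-up unit costs: returns prologue + epilogue, which is how
   the caller in this commit derives the outside-of-loop cost.  */
int
sketch_outside_cost (void)
{
  unsigned prologue, body, epilogue;
  unsigned *data = sketch_init_cost ();

  if (data == NULL)
    return -1;
  sketch_add_stmt_cost (data, 2, 3, SKETCH_PROLOGUE);  /* two guard branches */
  sketch_add_stmt_cost (data, 4, 1, SKETCH_BODY);      /* four vector stmts */
  sketch_add_stmt_cost (data, 5, 1, SKETCH_EPILOGUE);  /* five scalar stmts */
  sketch_finish_cost (data, &prologue, &body, &epilogue);
  free (data);
  return (int) (prologue + epilogue);
}
```

Keeping one accumulator per location means a caller no longer has to store an outside-of-loop cost on each statement or SLP node; it can ask for all three totals once at the end.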
Diffstat (limited to 'gcc/tree-vect-loop.c')
-rw-r--r-- | gcc/tree-vect-loop.c | 234 |
1 files changed, 121 insertions, 113 deletions
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 163bc573e37..d9094679dd2 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -2440,9 +2440,11 @@ vect_get_single_scalar_iteration_cost (loop_vec_info loop_vinfo)
 int
 vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue,
                              int *peel_iters_epilogue,
-                             int scalar_single_iter_cost)
+                             int scalar_single_iter_cost,
+                             stmt_vector_for_cost *prologue_cost_vec,
+                             stmt_vector_for_cost *epilogue_cost_vec)
 {
-  int peel_guard_costs = 0;
+  int retval = 0;
   int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);

   if (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo))
@@ -2455,7 +2457,8 @@ vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue,

       /* If peeled iterations are known but number of scalar loop
          iterations are unknown, count a taken branch per peeled loop. */
-      peel_guard_costs = 2 * vect_get_stmt_cost (cond_branch_taken);
+      retval = record_stmt_cost (prologue_cost_vec, 2, cond_branch_taken,
+                                 NULL, 0, vect_prologue);
     }
   else
     {
@@ -2469,9 +2472,15 @@ vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue,
         *peel_iters_epilogue = vf;
     }

-  return (peel_iters_prologue * scalar_single_iter_cost)
-         + (*peel_iters_epilogue * scalar_single_iter_cost)
-         + peel_guard_costs;
+  if (peel_iters_prologue)
+    retval += record_stmt_cost (prologue_cost_vec,
+                                peel_iters_prologue * scalar_single_iter_cost,
+                                scalar_stmt, NULL, 0, vect_prologue);
+  if (*peel_iters_epilogue)
+    retval += record_stmt_cost (epilogue_cost_vec,
+                                *peel_iters_epilogue * scalar_single_iter_cost,
+                                scalar_stmt, NULL, 0, vect_epilogue);
+  return retval;
 }

 /* Function vect_estimate_min_profitable_iters
@@ -2486,22 +2495,18 @@ vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue,
 int
 vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
 {
-  int i;
   int min_profitable_iters;
   int peel_iters_prologue;
   int peel_iters_epilogue;
-  int vec_inside_cost = 0;
+  unsigned vec_inside_cost = 0;
   int vec_outside_cost = 0;
+  unsigned vec_prologue_cost = 0;
+  unsigned vec_epilogue_cost = 0;
   int scalar_single_iter_cost = 0;
   int scalar_outside_cost = 0;
   int vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
-  basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes;
   int npeel = LOOP_PEELING_FOR_ALIGNMENT (loop_vinfo);
-  int peel_guard_costs = 0;
-  VEC (slp_instance, heap) *slp_instances;
-  slp_instance instance;
+  void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

   /* Cost model disabled. */
   if (!flag_vect_cost_model)
@@ -2515,8 +2520,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
   if (LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo))
     {
       /* FIXME: Make cost depend on complexity of individual check. */
-      vec_outside_cost +=
-        VEC_length (gimple, LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo));
+      unsigned len = VEC_length (gimple,
+                                 LOOP_VINFO_MAY_MISALIGN_STMTS (loop_vinfo));
+      (void) add_stmt_cost (target_cost_data, len, vector_stmt, NULL, 0,
+                            vect_prologue);
       if (vect_print_dump_info (REPORT_COST))
         fprintf (vect_dump, "cost model: Adding cost of checks for loop "
                  "versioning to treat misalignment.\n");
@@ -2526,8 +2533,9 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
   if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
     {
       /* FIXME: Make cost depend on complexity of individual check. */
-      vec_outside_cost +=
-        VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo));
+      unsigned len = VEC_length (ddr_p, LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo));
+      (void) add_stmt_cost (target_cost_data, len, vector_stmt, NULL, 0,
+                            vect_prologue);
       if (vect_print_dump_info (REPORT_COST))
         fprintf (vect_dump, "cost model: Adding cost of checks for loop "
                  "versioning aliasing.\n");
@@ -2535,7 +2543,8 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)

   if (LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo)
       || LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
-    vec_outside_cost += vect_get_stmt_cost (cond_branch_taken);
+    (void) add_stmt_cost (target_cost_data, 1, cond_branch_taken, NULL, 0,
+                          vect_prologue);

   /* Count statements in scalar loop. Using this as scalar cost for a single
      iteration for now.
@@ -2545,52 +2554,6 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
      TODO: Consider assigning different costs to different scalar
      statements. */

-  for (i = 0; i < nbbs; i++)
-    {
-      gimple_stmt_iterator si;
-      basic_block bb = bbs[i];
-
-      for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
-        {
-          gimple stmt = gsi_stmt (si);
-          stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
-
-          if (STMT_VINFO_IN_PATTERN_P (stmt_info))
-            {
-              stmt = STMT_VINFO_RELATED_STMT (stmt_info);
-              stmt_info = vinfo_for_stmt (stmt);
-            }
-
-          /* Skip stmts that are not vectorized inside the loop. */
-          if (!STMT_VINFO_RELEVANT_P (stmt_info)
-              && (!STMT_VINFO_LIVE_P (stmt_info)
-                  || !VECTORIZABLE_CYCLE_DEF (STMT_VINFO_DEF_TYPE (stmt_info))))
-            continue;
-
-          /* FIXME: for stmts in the inner-loop in outer-loop vectorization,
-             some of the "outside" costs are generated inside the outer-loop. */
-          vec_outside_cost += STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info);
-          if (is_pattern_stmt_p (stmt_info)
-              && STMT_VINFO_PATTERN_DEF_SEQ (stmt_info))
-            {
-              gimple_stmt_iterator gsi;
-
-              for (gsi = gsi_start (STMT_VINFO_PATTERN_DEF_SEQ (stmt_info));
-                   !gsi_end_p (gsi); gsi_next (&gsi))
-                {
-                  gimple pattern_def_stmt = gsi_stmt (gsi);
-                  stmt_vec_info pattern_def_stmt_info
-                    = vinfo_for_stmt (pattern_def_stmt);
-                  if (STMT_VINFO_RELEVANT_P (pattern_def_stmt_info)
-                      || STMT_VINFO_LIVE_P (pattern_def_stmt_info))
-                    vec_outside_cost
-                      += STMT_VINFO_OUTSIDE_OF_LOOP_COST
-                        (pattern_def_stmt_info);
-                }
-            }
-        }
-    }
-
   scalar_single_iter_cost = vect_get_single_scalar_iteration_cost (loop_vinfo);

   /* Add additional cost for the peeled instructions in prologue and epilogue
@@ -2621,18 +2584,54 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
          branch per peeled loop. Even if scalar loop iterations are known,
          vector iterations are not known since peeled prologue iterations are
          not known. Hence guards remain the same. */
-      peel_guard_costs += 2 * (vect_get_stmt_cost (cond_branch_taken)
-                               + vect_get_stmt_cost (cond_branch_not_taken));
-      vec_outside_cost += (peel_iters_prologue * scalar_single_iter_cost)
-                          + (peel_iters_epilogue * scalar_single_iter_cost)
-                          + peel_guard_costs;
+      (void) add_stmt_cost (target_cost_data, 2, cond_branch_taken,
+                            NULL, 0, vect_prologue);
+      (void) add_stmt_cost (target_cost_data, 2, cond_branch_not_taken,
+                            NULL, 0, vect_prologue);
+      /* FORNOW: Don't attempt to pass individual scalar instructions to
+         the model; just assume linear cost for scalar iterations. */
+      (void) add_stmt_cost (target_cost_data,
+                            peel_iters_prologue * scalar_single_iter_cost,
+                            scalar_stmt, NULL, 0, vect_prologue);
+      (void) add_stmt_cost (target_cost_data,
+                            peel_iters_epilogue * scalar_single_iter_cost,
+                            scalar_stmt, NULL, 0, vect_epilogue);
     }
   else
     {
+      stmt_vector_for_cost prologue_cost_vec, epilogue_cost_vec;
+      stmt_info_for_cost *si;
+      int j;
+      void *data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
+
+      prologue_cost_vec = VEC_alloc (stmt_info_for_cost, heap, 2);
+      epilogue_cost_vec = VEC_alloc (stmt_info_for_cost, heap, 2);
       peel_iters_prologue = npeel;
-      vec_outside_cost += vect_get_known_peeling_cost (loop_vinfo,
-                                    peel_iters_prologue, &peel_iters_epilogue,
-                                    scalar_single_iter_cost);
+
+      (void) vect_get_known_peeling_cost (loop_vinfo, peel_iters_prologue,
+                                          &peel_iters_epilogue,
+                                          scalar_single_iter_cost,
+                                          &prologue_cost_vec,
+                                          &epilogue_cost_vec);
+
+      FOR_EACH_VEC_ELT (stmt_info_for_cost, prologue_cost_vec, j, si)
+        {
+          struct _stmt_vec_info *stmt_info
+            = si->stmt ? vinfo_for_stmt (si->stmt) : NULL;
+          (void) add_stmt_cost (data, si->count, si->kind, stmt_info,
+                                si->misalign, vect_prologue);
+        }
+
+      FOR_EACH_VEC_ELT (stmt_info_for_cost, epilogue_cost_vec, j, si)
+        {
+          struct _stmt_vec_info *stmt_info
+            = si->stmt ? vinfo_for_stmt (si->stmt) : NULL;
+          (void) add_stmt_cost (data, si->count, si->kind, stmt_info,
+                                si->misalign, vect_epilogue);
+        }
+
+      VEC_free (stmt_info_for_cost, heap, prologue_cost_vec);
+      VEC_free (stmt_info_for_cost, heap, epilogue_cost_vec);
     }

   /* FORNOW: The scalar outside cost is incremented in one of the
@@ -2708,14 +2707,11 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
         }
     }

-  /* Add SLP costs. */
-  slp_instances = LOOP_VINFO_SLP_INSTANCES (loop_vinfo);
-  FOR_EACH_VEC_ELT (slp_instance, slp_instances, i, instance)
-    vec_outside_cost += SLP_INSTANCE_OUTSIDE_OF_LOOP_COST (instance);
+  /* Complete the target-specific cost calculations. */
+  finish_cost (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo), &vec_prologue_cost,
+               &vec_inside_cost, &vec_epilogue_cost);

-  /* Complete the target-specific cost calculation for the inside-of-loop
-     costs. */
-  vec_inside_cost = finish_cost (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo));
+  vec_outside_cost = (int)(vec_prologue_cost + vec_epilogue_cost);

   /* Calculate number of iterations required to make the vector version
      profitable, relative to the loop bodies only. The following condition
@@ -2727,7 +2723,7 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
      PL_ITERS = prologue iterations, EP_ITERS= epilogue iterations
      SOC = scalar outside cost for run time cost model check. */

-  if ((scalar_single_iter_cost * vf) > vec_inside_cost)
+  if ((scalar_single_iter_cost * vf) > (int) vec_inside_cost)
     {
       if (vec_outside_cost <= 0)
         min_profitable_iters = 1;
@@ -2740,8 +2736,8 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
                                     - vec_inside_cost);

           if ((scalar_single_iter_cost * vf * min_profitable_iters)
-              <= ((vec_inside_cost * min_profitable_iters)
-                  + ((vec_outside_cost - scalar_outside_cost) * vf)))
+              <= (((int) vec_inside_cost * min_profitable_iters)
+                  + (((int) vec_outside_cost - scalar_outside_cost) * vf)))
             min_profitable_iters++;
         }
     }
@@ -2761,8 +2757,10 @@ vect_estimate_min_profitable_iters (loop_vec_info loop_vinfo)
       fprintf (vect_dump, "Cost model analysis: \n");
       fprintf (vect_dump, " Vector inside of loop cost: %d\n",
                vec_inside_cost);
-      fprintf (vect_dump, " Vector outside of loop cost: %d\n",
-               vec_outside_cost);
+      fprintf (vect_dump, " Vector prologue cost: %d\n",
+               vec_prologue_cost);
+      fprintf (vect_dump, " Vector epilogue cost: %d\n",
+               vec_epilogue_cost);
       fprintf (vect_dump, " Scalar iteration cost: %d\n",
                scalar_single_iter_cost);
       fprintf (vect_dump, " Scalar outside cost: %d\n", scalar_outside_cost);
@@ -2803,7 +2801,7 @@ static bool
 vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
                            int ncopies)
 {
-  int outer_cost = 0;
+  int prologue_cost = 0, epilogue_cost = 0;
   enum tree_code code;
   optab optab;
   tree vectype;
@@ -2812,12 +2810,11 @@ vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
   enum machine_mode mode;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);

   /* Cost of reduction op inside loop. */
-  unsigned inside_cost
-    = add_stmt_cost (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo),
-                     ncopies, vector_stmt, stmt_info, 0);
-
+  unsigned inside_cost = add_stmt_cost (target_cost_data, ncopies, vector_stmt,
+                                        stmt_info, 0, vect_body);
   stmt = STMT_VINFO_STMT (stmt_info);

   switch (get_gimple_rhs_class (gimple_assign_rhs_code (stmt)))
@@ -2859,7 +2856,8 @@ vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
   code = gimple_assign_rhs_code (orig_stmt);

   /* Add in cost for initial definition. */
-  outer_cost += vect_get_stmt_cost (scalar_to_vec);
+  prologue_cost += add_stmt_cost (target_cost_data, 1, scalar_to_vec,
+                                  stmt_info, 0, vect_prologue);

   /* Determine cost of epilogue code.

@@ -2869,8 +2867,12 @@ vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
   if (!nested_in_vect_loop_p (loop, orig_stmt))
     {
       if (reduc_code != ERROR_MARK)
-        outer_cost += vect_get_stmt_cost (vector_stmt)
-                      + vect_get_stmt_cost (vec_to_scalar);
+        {
+          epilogue_cost += add_stmt_cost (target_cost_data, 1, vector_stmt,
+                                          stmt_info, 0, vect_epilogue);
+          epilogue_cost += add_stmt_cost (target_cost_data, 1, vec_to_scalar,
+                                          stmt_info, 0, vect_epilogue);
+        }
       else
         {
           int vec_size_in_bits = tree_low_cst (TYPE_SIZE (vectype), 1);
@@ -2885,25 +2887,31 @@ vect_model_reduction_cost (stmt_vec_info stmt_info, enum tree_code reduc_code,
           if (VECTOR_MODE_P (mode)
               && optab_handler (optab, mode) != CODE_FOR_nothing
               && optab_handler (vec_shr_optab, mode) != CODE_FOR_nothing)
-            /* Final reduction via vector shifts and the reduction operator. Also
-               requires scalar extract. */
-            outer_cost += ((exact_log2(nelements) * 2)
-                           * vect_get_stmt_cost (vector_stmt)
-                           + vect_get_stmt_cost (vec_to_scalar));
+            {
+              /* Final reduction via vector shifts and the reduction operator.
+                 Also requires scalar extract. */
+              epilogue_cost += add_stmt_cost (target_cost_data,
+                                              exact_log2 (nelements) * 2,
+                                              vector_stmt, stmt_info, 0,
+                                              vect_epilogue);
+              epilogue_cost += add_stmt_cost (target_cost_data, 1,
+                                              vec_to_scalar, stmt_info, 0,
+                                              vect_epilogue);
+            }
           else
-            /* Use extracts and reduction op for final reduction. For N elements,
-               we have N extracts and N-1 reduction ops. */
-            outer_cost += ((nelements + nelements - 1)
-                           * vect_get_stmt_cost (vector_stmt));
+            /* Use extracts and reduction op for final reduction. For N
+               elements, we have N extracts and N-1 reduction ops. */
+            epilogue_cost += add_stmt_cost (target_cost_data,
+                                            nelements + nelements - 1,
+                                            vector_stmt, stmt_info, 0,
+                                            vect_epilogue);
         }
     }

-  STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info) = outer_cost;
-
   if (vect_print_dump_info (REPORT_COST))
     fprintf (vect_dump, "vect_model_reduction_cost: inside_cost = %d, "
-             "outside_cost = %d .", inside_cost,
-             STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
+             "prologue_cost = %d, epilogue_cost = %d .", inside_cost,
+             prologue_cost, epilogue_cost);

   return true;
 }
@@ -2917,20 +2925,20 @@ static void
 vect_model_induction_cost (stmt_vec_info stmt_info, int ncopies)
 {
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  void *target_cost_data = LOOP_VINFO_TARGET_COST_DATA (loop_vinfo);
+  unsigned inside_cost, prologue_cost;

   /* loop cost for vec_loop. */
-  unsigned inside_cost
-    = add_stmt_cost (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo), ncopies,
-                     vector_stmt, stmt_info, 0);
+  inside_cost = add_stmt_cost (target_cost_data, ncopies, vector_stmt,
+                               stmt_info, 0, vect_body);

   /* prologue cost for vec_init and vec_step. */
-  STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info)
-    = 2 * vect_get_stmt_cost (scalar_to_vec);
+  prologue_cost = add_stmt_cost (target_cost_data, 2, scalar_to_vec,
+                                 stmt_info, 0, vect_prologue);

   if (vect_print_dump_info (REPORT_COST))
     fprintf (vect_dump, "vect_model_induction_cost: inside_cost = %d, "
-             "outside_cost = %d .", inside_cost,
-             STMT_VINFO_OUTSIDE_OF_LOOP_COST (stmt_info));
+             "prologue_cost = %d .", inside_cost, prologue_cost);
 }
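
The epilogue statement counts that vect_model_reduction_cost passes to add_stmt_cost in the diff above can be worked through in isolation. This is only an illustrative sketch: the `sketch_` names, the strategy enum, and the counts struct are hypothetical, and the real function chooses its strategy from reduc_code and optab availability and does not return counts.

```c
#include <assert.h>

/* Hypothetical helper: base-2 logarithm of a power of two.  */
int
sketch_exact_log2 (unsigned n)
{
  int log = 0;

  assert (n != 0 && (n & (n - 1)) == 0);
  while (n > 1)
    {
      n >>= 1;
      log++;
    }
  return log;
}

/* Hypothetical labels for the three epilogue strategies in the diff.  */
enum sketch_strategy
{
  SKETCH_REDUC_INSN,  /* reduc_code != ERROR_MARK */
  SKETCH_SHIFTS,      /* vector shifts plus the reduction operator */
  SKETCH_EXTRACTS     /* element extracts plus scalar reduction ops */
};

struct sketch_epilogue_counts
{
  int vector_stmts;         /* statements charged as vector_stmt */
  int vec_to_scalar_stmts;  /* statements charged as vec_to_scalar */
};

/* Return how many statements of each kind are charged to the epilogue
   for a vector of NELEMENTS elements under strategy S.  */
struct sketch_epilogue_counts
sketch_reduction_epilogue (enum sketch_strategy s, unsigned nelements)
{
  struct sketch_epilogue_counts c = { 0, 0 };

  switch (s)
    {
    case SKETCH_REDUC_INSN:
      c.vector_stmts = 1;
      c.vec_to_scalar_stmts = 1;
      break;
    case SKETCH_SHIFTS:
      c.vector_stmts = sketch_exact_log2 (nelements) * 2;
      c.vec_to_scalar_stmts = 1;
      break;
    case SKETCH_EXTRACTS:
      /* N extracts and N-1 reduction ops.  */
      c.vector_stmts = (int) (nelements + nelements - 1);
      break;
    }
  return c;
}
```

For a four-element vector, the shift strategy charges four vector statements plus one vec_to_scalar, while the extract strategy charges seven vector statements.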