author | H.J. Lu <hongjiu.lu@intel.com> | 2007-05-22 14:37:19 +0000 |
---|---|---|
committer | H.J. Lu <hongjiu.lu@intel.com> | 2007-05-22 14:37:19 +0000 |
commit | 4128d31ef3b202ec3234e9dad572edb0c847d85a | |
tree | 2c9aaab1ef7877eb59417c50602fc058c38c90d6 /gcc/doc | |
parent | 975aa68b087e9b443fc7df8d60f8fd8a08393d20 | |
2007-05-22 H.J. Lu <hongjiu.lu@intel.com>
Richard Henderson <rth@redhat.com>
* config.gcc (i[34567]86-*-*): Add smmintrin.h to
extra_headers.
(x86_64-*-*): Likewise.
* i386/i386-modes.def (V2QI): New.
* config/i386/i386.c (ix86_handle_option): Handle SSE4.1 and
SSE4A.
(override_options): Support SSE4.1.
(IX86_BUILTIN_BLENDPD): New for SSE4.1.
(IX86_BUILTIN_BLENDPS): Likewise.
(IX86_BUILTIN_BLENDVPD): Likewise.
(IX86_BUILTIN_BLENDVPS): Likewise.
(IX86_BUILTIN_PBLENDVB128): Likewise.
(IX86_BUILTIN_PBLENDW128): Likewise.
(IX86_BUILTIN_DPPD): Likewise.
(IX86_BUILTIN_DPPS): Likewise.
(IX86_BUILTIN_INSERTPS128): Likewise.
(IX86_BUILTIN_MOVNTDQA): Likewise.
(IX86_BUILTIN_MPSADBW128): Likewise.
(IX86_BUILTIN_PACKUSDW128): Likewise.
(IX86_BUILTIN_PCMPEQQ): Likewise.
(IX86_BUILTIN_PHMINPOSUW128): Likewise.
(IX86_BUILTIN_PMAXSB128): Likewise.
(IX86_BUILTIN_PMAXSD128): Likewise.
(IX86_BUILTIN_PMAXUD128): Likewise.
(IX86_BUILTIN_PMAXUW128): Likewise.
(IX86_BUILTIN_PMINSB128): Likewise.
(IX86_BUILTIN_PMINSD128): Likewise.
(IX86_BUILTIN_PMINUD128): Likewise.
(IX86_BUILTIN_PMINUW128): Likewise.
(IX86_BUILTIN_PMOVSXBW128): Likewise.
(IX86_BUILTIN_PMOVSXBD128): Likewise.
(IX86_BUILTIN_PMOVSXBQ128): Likewise.
(IX86_BUILTIN_PMOVSXWD128): Likewise.
(IX86_BUILTIN_PMOVSXWQ128): Likewise.
(IX86_BUILTIN_PMOVSXDQ128): Likewise.
(IX86_BUILTIN_PMOVZXBW128): Likewise.
(IX86_BUILTIN_PMOVZXBD128): Likewise.
(IX86_BUILTIN_PMOVZXBQ128): Likewise.
(IX86_BUILTIN_PMOVZXWD128): Likewise.
(IX86_BUILTIN_PMOVZXWQ128): Likewise.
(IX86_BUILTIN_PMOVZXDQ128): Likewise.
(IX86_BUILTIN_PMULDQ128): Likewise.
(IX86_BUILTIN_PMULLD128): Likewise.
(IX86_BUILTIN_ROUNDPD): Likewise.
(IX86_BUILTIN_ROUNDPS): Likewise.
(IX86_BUILTIN_ROUNDSD): Likewise.
(IX86_BUILTIN_ROUNDSS): Likewise.
(IX86_BUILTIN_PTESTZ): Likewise.
(IX86_BUILTIN_PTESTC): Likewise.
(IX86_BUILTIN_PTESTNZC): Likewise.
(IX86_BUILTIN_VEC_EXT_V16QI): Likewise.
(IX86_BUILTIN_VEC_SET_V2DI): Likewise.
(IX86_BUILTIN_VEC_SET_V4SF): Likewise.
(IX86_BUILTIN_VEC_SET_V4SI): Likewise.
(IX86_BUILTIN_VEC_SET_V16QI): Likewise.
(bdesc_ptest): New.
(bdesc_sse_3arg): Likewise.
(bdesc_2arg): Likewise.
(bdesc_1arg): Likewise.
(ix86_init_mmx_sse_builtins): Support SSE4.1. Handle SSE builtins
with 3 args.
(ix86_expand_sse_4_operands_builtin): New.
(ix86_expand_unop_builtin): Support 2 arg builtins with a constant
smaller than 8 bits as the 2nd arg.
(ix86_expand_sse_ptest): New.
(ix86_expand_builtin): Support SSE4.1. Support 3 arg SSE builtins.
(ix86_expand_vector_set): Support SSE4.1.
(ix86_expand_vector_extract): Likewise.
* config/i386/i386.h (TARGET_CPU_CPP_BUILTINS): Define
__SSE4_1__ for -msse4.1.
* config/i386/i386.md (UNSPEC_BLENDV): New for SSE4.1.
(UNSPEC_INSERTPS): Likewise.
(UNSPEC_DP): Likewise.
(UNSPEC_MOVNTDQA): Likewise.
(UNSPEC_MPSADBW): Likewise.
(UNSPEC_PHMINPOSUW): Likewise.
(UNSPEC_PTEST): Likewise.
(UNSPEC_ROUNDP): Likewise.
(UNSPEC_ROUNDS): Likewise.
* config/i386/i386.opt (msse4.1): New for SSE4.1.
* config/i386/predicates.md (const_pow2_1_to_2_operand): New.
(const_pow2_1_to_32768_operand): Likewise.
* config/i386/smmintrin.h: New. The SSE4.1 intrinsic header
file.
* config/i386/sse.md (*vec_setv4sf_sse4_1): New pattern for
SSE4.1.
(sse4_1_insertps): Likewise.
(*sse4_1_extractps): Likewise.
(sse4_1_ptest): Likewise.
(sse4_1_mulv2siv2di3): Likewise.
(*sse4_1_mulv4si3): Likewise.
(*sse4_1_smax<mode>3): Likewise.
(*sse4_1_umax<mode>3): Likewise.
(*sse4_1_smin<mode>3): Likewise.
(*sse4_1_umin<mode>3): Likewise.
(sse4_1_eqv2di3): Likewise.
(*sse4_1_pinsrb): Likewise.
(*sse4_1_pinsrd): Likewise.
(*sse4_1_pinsrq): Likewise.
(*sse4_1_pextrb): Likewise.
(*sse4_1_pextrb_memory): Likewise.
(*sse4_1_pextrw_memory): Likewise.
(*sse4_1_pextrq): Likewise.
(sse4_1_blendpd): Likewise.
(sse4_1_blendps): Likewise.
(sse4_1_blendvpd): Likewise.
(sse4_1_blendvps): Likewise.
(sse4_1_dppd): Likewise.
(sse4_1_dpps): Likewise.
(sse4_1_movntdqa): Likewise.
(sse4_1_mpsadbw): Likewise.
(sse4_1_packusdw): Likewise.
(sse4_1_pblendvb): Likewise.
(sse4_1_pblendw): Likewise.
(sse4_1_phminposuw): Likewise.
(sse4_1_extendv8qiv8hi2): Likewise.
(*sse4_1_extendv8qiv8hi2): Likewise.
(sse4_1_extendv4qiv4si2): Likewise.
(*sse4_1_extendv4qiv4si2): Likewise.
(sse4_1_extendv2qiv2di2): Likewise.
(*sse4_1_extendv2qiv2di2): Likewise.
(sse4_1_extendv4hiv4si2): Likewise.
(*sse4_1_extendv4hiv4si2): Likewise.
(sse4_1_extendv2hiv2di2): Likewise.
(*sse4_1_extendv2hiv2di2): Likewise.
(sse4_1_extendv2siv2di2): Likewise.
(*sse4_1_extendv2siv2di2): Likewise.
(sse4_1_zero_extendv8qiv8hi2): Likewise.
(*sse4_1_zero_extendv8qiv8hi2): Likewise.
(sse4_1_zero_extendv4qiv4si2): Likewise.
(*sse4_1_zero_extendv4qiv4si2): Likewise.
(sse4_1_zero_extendv2qiv2di2): Likewise.
(*sse4_1_zero_extendv2qiv2di2): Likewise.
(sse4_1_zero_extendv4hiv4si2): Likewise.
(*sse4_1_zero_extendv4hiv4si2): Likewise.
(sse4_1_zero_extendv2hiv2di2): Likewise.
(*sse4_1_zero_extendv2hiv2di2): Likewise.
(sse4_1_zero_extendv2siv2di2): Likewise.
(*sse4_1_zero_extendv2siv2di2): Likewise.
(sse4_1_roundpd): Likewise.
(sse4_1_roundps): Likewise.
(sse4_1_roundsd): Likewise.
(sse4_1_roundss): Likewise.
(mulv4si3): Don't expand for SSE4.1.
(smax<mode>3): Likewise.
(umaxv4si3): Likewise.
(uminv16qi3): Likewise.
(umin<mode>3): Likewise.
(umaxv8hi3): Rewrite. Only enabled for SSE4.1.
* doc/extend.texi: Document SSE4.1 built-in functions.
* doc/invoke.texi: Document -msse4.1.
git-svn-id: https://gcc.gnu.org/svn/gcc/trunk@124945 138bc75d-0d04-0410-961f-82ee72b054a4
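The log entries for `config/i386/i386.h` and `config/i386/smmintrin.h` can be illustrated with a minimal sketch. It is not part of the commit. It assumes a compiler that defines `__SSE4_1__` under `-msse4.1` and provides `smmintrin.h`, and the helper name `mul4_i32` is hypothetical. When the macro is absent, a portable loop computes the same result.

```c
#include <stdint.h>

#ifdef __SSE4_1__
#include <smmintrin.h>  /* SSE4.1 intrinsics header; only pulled in under -msse4.1 */
#endif

/* Hypothetical helper (not from the commit): multiply four 32-bit lanes
   element-wise and keep the low 32 bits of each product. */
void mul4_i32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
#ifdef __SSE4_1__
    /* Unaligned loads and stores, so the arrays need no special alignment. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_mullo_epi32(va, vb)); /* pmulld */
#else
    /* Portable fallback: multiply as unsigned so overflow wraps. */
    for (int i = 0; i < 4; i++)
        out[i] = (int32_t)((uint32_t)a[i] * (uint32_t)b[i]);
#endif
}
```

Compiling with `-msse4.1` should select the `pmulld` path. Without the flag the loop is used, so the same source builds on targets that lack SSE4.1.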
Diffstat (limited to 'gcc/doc')
-rw-r--r-- | gcc/doc/extend.texi | 78 |
-rw-r--r-- | gcc/doc/invoke.texi | 8 |
2 files changed, 84 insertions, 2 deletions
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 2a310571b71..a09e4530977 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7396,6 +7396,84 @@ v4si __builtin_ia32_pabsd128 (v4si)
 v8hi __builtin_ia32_pabsw128 (v8hi)
 @end smallexample

+The following built-in functions are available when @option{-msse4.1} is
+used. All of them generate the machine instruction that is part of the
+name.
+
+@smallexample
+v2df __builtin_ia32_blendpd (v2df, v2df, const int)
+v4sf __builtin_ia32_blendps (v4sf, v4sf, const int)
+v2df __builtin_ia32_blendvpd (v2df, v2df, v2df)
+v4sf __builtin_ia32_blendvps (v4sf, v4sf, v4sf)
+v2df __builtin_ia32_dppd (__v2df, __v2df, const int)
+v4sf __builtin_ia32_dpps (v4sf, v4sf, const int)
+v4sf __builtin_ia32_insertps128 (v4sf, v4sf, const int)
+v2di __builtin_ia32_movntdqa (v2di *);
+v16qi __builtin_ia32_mpsadbw128 (v16qi, v16qi, const int)
+v8hi __builtin_ia32_packusdw128 (v4si, v4si)
+v16qi __builtin_ia32_pblendvb128 (v16qi, v16qi, v16qi)
+v8hi __builtin_ia32_pblendw128 (v8hi, v8hi, const int)
+v2di __builtin_ia32_pcmpeqq (v2di, v2di)
+v8hi __builtin_ia32_phminposuw128 (v8hi)
+v16qi __builtin_ia32_pmaxsb128 (v16qi, v16qi)
+v4si __builtin_ia32_pmaxsd128 (v4si, v4si)
+v4si __builtin_ia32_pmaxud128 (v4si, v4si)
+v8hi __builtin_ia32_pmaxuw128 (v8hi, v8hi)
+v16qi __builtin_ia32_pminsb128 (v16qi, v16qi)
+v4si __builtin_ia32_pminsd128 (v4si, v4si)
+v4si __builtin_ia32_pminud128 (v4si, v4si)
+v8hi __builtin_ia32_pminuw128 (v8hi, v8hi)
+v4si __builtin_ia32_pmovsxbd128 (v16qi)
+v2di __builtin_ia32_pmovsxbq128 (v16qi)
+v8hi __builtin_ia32_pmovsxbw128 (v16qi)
+v2di __builtin_ia32_pmovsxdq128 (v4si)
+v4si __builtin_ia32_pmovsxwd128 (v8hi)
+v2di __builtin_ia32_pmovsxwq128 (v8hi)
+v4si __builtin_ia32_pmovzxbd128 (v16qi)
+v2di __builtin_ia32_pmovzxbq128 (v16qi)
+v8hi __builtin_ia32_pmovzxbw128 (v16qi)
+v2di __builtin_ia32_pmovzxdq128 (v4si)
+v4si __builtin_ia32_pmovzxwd128 (v8hi)
+v2di __builtin_ia32_pmovzxwq128 (v8hi)
+v2di __builtin_ia32_pmuldq128 (v4si, v4si)
+v4si __builtin_ia32_pmulld128 (v4si, v4si)
+int __builtin_ia32_ptestc128 (v2di, v2di)
+int __builtin_ia32_ptestnzc128 (v2di, v2di)
+int __builtin_ia32_ptestz128 (v2di, v2di)
+v2df __builtin_ia32_roundpd (v2df, const int)
+v4sf __builtin_ia32_roundps (v4sf, const int)
+v2df __builtin_ia32_roundsd (v2df, v2df, const int)
+v4sf __builtin_ia32_roundss (v4sf, v4sf, const int)
+@end smallexample
+
+The following built-in functions are available when @option{-msse4.1} is
+used.
+
+@table @code
+@item v4sf __builtin_ia32_vec_set_v4sf (v4sf, float, const int)
+Generates the @code{insertps} machine instruction.
+@item int __builtin_ia32_vec_ext_v16qi (v16qi, const int)
+Generates the @code{pextrb} machine instruction.
+@item v16qi __builtin_ia32_vec_set_v16qi (v16qi, int, const int)
+Generates the @code{pinsrb} machine instruction.
+@item v4si __builtin_ia32_vec_set_v4si (v4si, int, const int)
+Generates the @code{pinsrd} machine instruction.
+@item v2di __builtin_ia32_vec_set_v2di (v2di, long long, const int)
+Generates the @code{pinsrq} machine instruction in 64bit mode.
+@end table
+
+The following built-in functions are changed to generate new SSE4.1
+instructions when @option{-msse4.1} is used.
+
+@table @code
+@item float __builtin_ia32_vec_ext_v4sf (v4sf, const int)
+Generates the @code{extractps} machine instruction.
+@item int __builtin_ia32_vec_ext_v4si (v4si, const int)
+Generates the @code{pextrd} machine instruction.
+@item long long __builtin_ia32_vec_ext_v2di (v2di, const int)
+Generates the @code{pextrq} machine instruction in 64bit mode.
+@end table
+
 The following built-in functions are available when @option{-msse4a} is used.

 @smallexample
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d8260ba120b..21ef96cae7c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -547,7 +547,8 @@ Objective-C and Objective-C++ Dialects}.
 -mno-fp-ret-in-387 -msoft-float @gol
 -mno-wide-multiply -mrtd -malign-double @gol
 -mpreferred-stack-boundary=@var{num} -mcx16 -msahf @gol
--mmmx -msse -msse2 -msse3 -mssse3 -msse4a -m3dnow -mpopcnt -mabm @gol
+-mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 @gol
+-msse4a -m3dnow -mpopcnt -mabm @gol
 -mthreads -mno-align-stringops -minline-all-stringops @gol
 -mpush-args -maccumulate-outgoing-args -m128bit-long-double @gol
 -m96bit-long-double -mregparm=@var{num} -msseregparm @gol
@@ -10260,6 +10261,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @itemx -mno-sse3
 @item -mssse3
 @itemx -mno-ssse3
+@item -msse4.1
+@itemx -mno-sse4.1
 @item -msse4a
 @item -mno-sse4a
 @item -m3dnow
@@ -10275,7 +10278,8 @@ preferred alignment to @option{-mpreferred-stack-boundary=2}.
 @opindex m3dnow
 @opindex mno-3dnow
 These switches enable or disable the use of instructions in the MMX,
-SSE, SSE2, SSE3, SSSE3, SSE4A, ABM or 3DNow! extended instruction sets.
+SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4A, ABM or 3DNow! extended
+instruction sets.
 These extensions are also available as built-in functions: see
 @ref{X86 Built-in Functions}, for details of the functions enabled and
 disabled by these switches.
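As an illustration of one built-in documented in the `extend.texi` hunk, `int __builtin_ia32_ptestz128 (v2di, v2di)`, here is a small sketch. It is not part of the commit. The `v2di` typedef uses the GCC vector extension, the helper name `and_is_zero` is hypothetical, and the fallback branch is an assumption added so that the code also builds without `-msse4.1`.

```c
/* 128-bit vector of two 64-bit integers, matching the v2di type named in
   extend.texi (GCC vector extension). */
typedef long long v2di __attribute__((vector_size(16)));

/* Hypothetical helper (not from the commit): return 1 when the bitwise AND
   of a and b has no bits set, 0 otherwise. */
int and_is_zero(v2di a, v2di b)
{
#ifdef __SSE4_1__
    /* ptest sets ZF when (a & b) is all zeros; the built-in returns ZF. */
    return __builtin_ia32_ptestz128(a, b);
#else
    /* Generic fallback using vector AND and element subscripting. */
    v2di m = a & b;
    return (m[0] | m[1]) == 0;
#endif
}
```

With `-msse4.1` the call is expected to compile to a single `ptest` plus a flag read. Code meant to be portable would normally use the `smmintrin.h` wrapper instead of calling the built-in directly.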